[2] How to use the pykan library : API Demo

Uncategorized · 2024. 10. 14. 01:42

The easiest way to use Kolmogorov-Arnold Networks (KAN)! In this post I summarize how to use the pykan library.

※ Everything here is translated from https://kindxiaoming.github.io/pykan/intro.html, and all copyright belongs to that page. Some expressions may be inaccurate, so please be sure to check the original page!

API 1 : Indexing

from kan import *

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

model = KAN(width = [2, 3, 2, 1], noise_scale = 0.3, device = device)
x = torch.normal(0, 1, size = (100, 2)).to(device)
model(x)
beta = 100
model.plot(beta = beta)

Indexing of edges(activation functions)

model.fix_symbolic(0, 0, 0, 'sin')
model.plot(beta = beta)
model.unfix_symbolic(0, 0, 0)

model.fix_symbolic(0,0,1,'sin')
model.plot(beta=beta)
model.unfix_symbolic(0,0,1)

model.fix_symbolic(0,1,0,'sin')
model.plot(beta=beta)
model.unfix_symbolic(0,1,0)

Indexing of nodes(neurons)

model.remove_node(1, 0)
model.plot(beta=beta)

Indexing of layers

# KAN spline layers are referred to as act_fun
# KAN symbolic layers are referred to as symbolic_fun
model = KAN(width = [2, 3, 5, 1])

i = 0
print(model.act_fun[i])
print(model.symbolic_fun[i])

for i in range(3):
  print(model.act_fun[i].in_dim, model.act_fun[i].out_dim)
  print(model.symbolic_fun[i].in_dim, model.symbolic_fun[i].out_dim)
>>
KANLayer(
  (base_fun): SiLU()
)
Symbolic_KANLayer()
2 3
2 3
3 5
3 5
5 1
5 1

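* KAN spline layer : referred to as act_fun; this is the layer that holds the (spline) activation functions.

* KAN symbolic layer : referred to as symbolic_fun; this layer processes data through mathematical functions, performing polynomial or other symbol-based transformations.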

model.act_fun[i].grid
model.act_fun[i].coef
model.symbolic_fun[i].funs_name
model.symbolic_fun[i].mask

model.act_fun[i].grid : the grid of the i-th spline layer
model.act_fun[i].coef : the coefficients of the i-th spline layer
model.symbolic_fun[i].funs_name : the function names of the i-th symbolic layer
model.symbolic_fun[i].mask : the mask of the i-th symbolic layer. It tells you, as True or False, whether a given activation is active.
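The indexing convention in this section can be summarized in a few lines of plain Python. This is only a sketch that does not use pykan itself: `layer_shapes` and `edge_indices` are hypothetical helper names, and it assumes, as in the fix_symbolic calls above, that an edge is addressed as (l, i, j) = (layer, input neuron, output neuron).

```python
def layer_shapes(width):
    """(in_dim, out_dim) of each KAN layer implied by a width list."""
    return [(width[l], width[l + 1]) for l in range(len(width) - 1)]


def edge_indices(width):
    """All (l, i, j) triples: layer l, input neuron i, output neuron j."""
    return [
        (l, i, j)
        for l, (n_in, n_out) in enumerate(layer_shapes(width))
        for i in range(n_in)
        for j in range(n_out)
    ]


shapes = layer_shapes([2, 3, 5, 1])
edges = edge_indices([2, 3, 2, 1])
print(shapes)      # [(2, 3), (3, 5), (5, 1)]
print(len(edges))  # 2*3 + 3*2 + 2*1 = 14
```

The shapes match the in_dim / out_dim pairs printed for width = [2, 3, 5, 1] above.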

API 2 : Plotting

from kan import *

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

# create a KAN: 2D inputs, 1D output, and 5 hidden neurons. cubic spline (k=3), 3 grid intervals (grid=3).
model = KAN(width=[2,5,1], grid=3, k=3, seed=1, device=device)

# create dataset f(x,y) = exp(sin(pi*x)+y^2)
f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
dataset = create_dataset(f, n_var=2, device=device)
dataset['train_input'].shape, dataset['train_label'].shape

# plot KAN at initialization
model(dataset['train_input']);
model.plot(beta=100)

# set variable names and title
model.plot(beta = 100, in_vars = [r'$\alpha$', 'x'], out_vars = ['y'], title = 'My KAN')

# train the model
model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01);

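$\beta$ controls the transparency of the activation functions. The larger $\beta$ is, the more activation functions are shown. Since we usually want only the important connections to stand out visually, choosing an appropriate $\beta$ matters.

The transparency is set to $scale \times \beta$, where $scale$ can be the magnitude of the activation function (metric = 'forward_u'), the normalized magnitude (metric = 'forward_n'), or the feature attribution score (metric = 'backward').

The defaults are $\beta = 3$ and metric = 'backward'.

[Figure: from left to right, metric = 'forward_n', 'forward_u', 'backward']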
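As a rough illustration of how $\beta$ changes what is visible, the sketch below computes an opacity for a few edges as scale × β, clipped to 1. The clipping, the helper name `edge_opacity`, and the example scores are my own assumptions for illustration; pykan's actual plotting code may map scores to opacity differently.

```python
def edge_opacity(scale, beta):
    """Opacity of one edge: scale * beta, clipped to 1 (the clipping is an assumption)."""
    return min(1.0, scale * beta)


# hypothetical importance scores of three edges
scores = [0.5, 0.05, 0.001]

for beta in (3, 100):
    print(beta, [round(edge_opacity(s, beta), 3) for s in scores])
# with beta = 3 only the strongest edge is fully opaque;
# with beta = 100 the second edge becomes fully opaque and the weakest one faintly visible
```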

model = model.prune()
model.plot()

model.plot(scale=0.5) # 0.5 is the reference value; a larger scale makes the figure bigger

To see the sample distribution along with the lines, set sample = True.

model.plot(sample = True)

model.get_act(dataset['train_input'][:20])
model.plot(sample=True)

model.fix_symbolic(0,1,0,'x^2')
>>
Best value at boundary.
r2 is 0.9899926781654358
saving model version 0.3
tensor(0.9900)

Once a function is set to symbolic, it is drawn in red.

model.set_mode(0,1,0,mode='ns')
model.plot(beta=100)

If a function is both symbolic and numeric, it is drawn in purple.

API 3 : Extracting activation functions

from kan import *
import matplotlib.pyplot as plt

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

# create a KAN: 2D inputs, 1D output, and 5 hidden neurons. cubic spline (k=3), 3 grid intervals (grid=3).
model = KAN(width=[2,5,1], grid=3, k=3, seed=1, device=device)
x = torch.normal(0,1,size=(100,2)).to(device)
model(x)
model.plot(beta=100)

Here the get_fun function is used to retrieve a specific activation function.

l = 0
i = 0
j = 3
x, y = model.get_fun(l,i,j)

model.get_range(l, i, j)
>> 
x range: [-2.40 , 2.44 ]
y range: [-0.19 , 0.16 ]
(array(-2.4043953, dtype=float32),
 array(2.4422815, dtype=float32),
 array(-0.19056274, dtype=float32),
 array(0.15698983, dtype=float32))

API 4 : Initialization

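Each activation function is initialized as follows:

$\phi(x) = {\rm scale\_base} \cdot b(x) + {\rm scale\_sp} \cdot {\rm spline}(x)$

$b(x)$ is the base function (silu by default) and can be set to something else.

- scale_sp is sampled from N(0, noise_scale^2).

- scale_base is sampled from N(scale_base_mu, scale_base_sigma^2).

sparse initialization : if sparse_init is set to True, most of the scale_sp and scale_base values are set to 0.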
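To make the sampling rules above concrete, here is a small stdlib-only sketch. It takes the description literally (normal distributions for both scales) and combines them as scale_base * b(x) + scale_sp * spline(x), the same form used in the Grid section of this post. The helper names (`sample_scales`, `activation`) and the spline value are hypothetical, and pykan's internals may differ in detail.

```python
import math
import random


def silu(x):
    """Default base function b(x) = x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def sample_scales(n_edges, noise_scale, scale_base_mu, scale_base_sigma, seed=0):
    """Draw (scale_sp, scale_base) per edge, following the description above."""
    rng = random.Random(seed)
    scale_sp = [rng.gauss(0.0, noise_scale) for _ in range(n_edges)]
    scale_base = [rng.gauss(scale_base_mu, scale_base_sigma) for _ in range(n_edges)]
    return scale_sp, scale_base


def activation(x, scale_base, scale_sp, spline_value):
    """phi(x) = scale_base * b(x) + scale_sp * spline(x)."""
    return scale_base * silu(x) + scale_sp * spline_value


scale_sp, scale_base = sample_scales(
    4, noise_scale=0.3, scale_base_mu=5.0, scale_base_sigma=0.0
)
print(scale_base)  # sigma = 0, so every scale_base equals scale_base_mu
print(activation(0.0, scale_base[0], scale_sp[0], spline_value=0.2))
```

With scale_base_sigma = 0 every edge starts with the same base scale, which is the situation in the scale_base_mu / scale_base_sigma experiment of this section.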

from kan import KAN, create_dataset_from_data
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

model = KAN(width = [2, 5, 1], grid = 5, k = 3, seed = 0, device = device)
x = torch.normal(0, 1, size = (100, 2)).to(device)
model(x)
model.plot()

Below, only the model is changed from one experiment to the next.

1. Linear initialization of activation functions

model = KAN(width = [2,5, 1], grid = 5, k = 3, seed = 0, base_fun = 'identity')

2. Noisy spline initialization (not recommended)

model = KAN(width=[2,5,1], grid=5, k=3, seed=0, noise_scale=0.3, device=device)

model = KAN(width=[2,5,1], grid=5, k=3, seed=0, noise_scale=10., device=device)

3. Setting scale_base_mu and scale_base_sigma

model = KAN(width=[2,5,1], grid=5, k=3, seed=0, scale_base_mu=5, scale_base_sigma=0, device=device)

model = KAN(width=[2,5,1], grid=5, k=3, seed=0, sparse_init=True, device=device)

API 5 : Grid

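A spline approximates a function only on a bounded region, so the grid has to be updated appropriately as the activation range changes during training. First, let's look at how splines are parameterized.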

from kan.spline import B_batch
import torch
import matplotlib.pyplot as plt
import numpy as np
from kan.spline import extend_grid

# consider a 1D example.
# Suppose we have grid in [-1,1] with G intervals, spline order k
G = 5
k = 3
grid = torch.linspace(-1,1,steps=G+1)[None,:]
>> tensor([[-1.0000, -0.6000, -0.2000,  0.2000,  0.6000,  1.0000]])
grid = extend_grid(grid, k_extend=k)
>> tensor([[-2.2000, -1.8000, -1.4000, -1.0000, -0.6000, -0.2000,  0.2000,  0.6000,
          1.0000,  1.4000,  1.8000,  2.2000]])

# and we have sample range in [-1,1]
x = torch.linspace(-1,1,steps=1001)[None,:]

basis = B_batch(x, grid, k=k)
# basis shape : [1, 1001, 8] = [batch, in_dim, G + k]

for i in range(G+k):
    plt.plot(x[0].detach().numpy(), basis[0,:,i].detach().numpy())

plt.legend(['B_{}(x)'.format(i) for i in np.arange(G+k)])
plt.xlabel('x')
plt.ylabel('B_i(x)')

kan.spline.B_batch : evaluate x on B-spline bases

Args:

x : 2D torch.tensor
    inputs, shape (number of splines, number of samples)
grid : 2D torch.tensor
    grids, shape (number of splines, number of grid points)
k : int
    the piecewise polynomial order of splines.
extend : bool
    If True, k points are extended on both ends. If False, no extension (zero boundary condition). Default: True
device : str
    device

Returns:

spline values : 3D torch.tensor
    shape (batch, in_dim, G+k). G: the number of grid intervals, k: spline order.

Example

>>> from kan.spline import B_batch
>>> x = torch.rand(100,2)
>>> grid = torch.linspace(-1,1,steps=11)[None, :].expand(2, 11)
>>> B_batch(x, grid, k=3).shape

There are G+k B-spline basis functions, and a function is built as a linear combination of these basis functions.

To check this, consider a [1, 1] KAN (equivalent to a 1D spline).

from kan import KAN

model = KAN(width=[1,1], grid=G, k=k)
# obtain coefficients c_i
model.act_fun[0].coef
>> tensor([[[ 0.0479,  0.0161, -0.0212, -0.0134,  0.0396, -0.0596,  0.0234,
           0.0625]]], requires_grad=True)
assert(model.act_fun[0].coef[0].shape[1] == G+k)

# the model forward
model_output = model(x[0][:,None])

# spline output
spline_output = torch.einsum('j,ij->i',model.act_fun[0].coef[0][0], basis[0])[:,None]

torch.mean((model_output - spline_output)**2)
>>
checkpoint directory created: ./model
saving model version 0.0
tensor(0.0040, grad_fn=<MeanBackward0>)

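torch.einsum('j,ij->i', model.act_fun[0].coef[0][0], basis[0]) :

'j,ij->i' : the einsum subscript string

j : axis name of the first tensor, model.act_fun[0].coef[0][0]

ij : axis names of the second tensor, basis[0]

->i : axes of the result tensor; j is summed over, so only the i axis remains.

[:, None] : adds a dimension, turning shape (n, ) into (n, 1).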

import torch

# example coefficient vector (a, b, c)
coefficients = torch.tensor([2.0, 0.5, -1.0])

# example spline basis values (3 samples, each with the values of 3 basis functions)
basis = torch.tensor([
    [1.0, 0.5, -0.5],   # first sample
    [0.0, 1.0, 0.5],    # second sample
    [-0.5, -1.0, 1.5]   # third sample
])

# linear combination computed with torch.einsum
spline_output = torch.einsum('j,ij->i', coefficients, basis)[:, None]

# print the results
print("coefficient vector:", coefficients)
print("spline basis values (basis):\n", basis)
print("spline output:\n", spline_output)

coefficient vector: tensor([ 2.0000,  0.5000, -1.0000])
spline basis values (basis):
 tensor([[ 1.0000,  0.5000, -0.5000],
        [ 0.0000,  1.0000,  0.5000],
        [-0.5000, -1.0000,  1.5000]])
spline output:
 tensor([[ 2.7500],
        [ 0.0000],
        [-3.0000]])
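For readers less used to einsum notation, the 'j,ij->i' contraction in the example above is just one weighted sum per sample. A plain-Python equivalent with the same numbers (no torch needed) looks like this:

```python
coefficients = [2.0, 0.5, -1.0]
basis = [
    [1.0, 0.5, -0.5],
    [0.0, 1.0, 0.5],
    [-0.5, -1.0, 1.5],
]

# 'j,ij->i': for every sample i, sum over the basis index j
spline_output = [sum(c * b for c, b in zip(coefficients, row)) for row in basis]
print(spline_output)  # [2.75, 0.0, -3.0]
```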

Here model_output and spline_output are not equal. The reason is that we defined each activation function as the sum of a residual function $b(x)$ and a spline function.

So the residual output has to be taken into account as well.

# residual output
residual_output = torch.nn.SiLU()(x[0][:,None])
scale_base = model.act_fun[0].scale_base
scale_sp = model.act_fun[0].scale_sp
torch.mean((model_output - (scale_base * residual_output + scale_sp * spline_output))**2)
>>
tensor(0., grad_fn=<MeanBackward0>)

What if the grid does not match the data? For example, the grid may lie in [-1, 1] while the data falls outside it. In that case, update_grid_from_samples can be used to adjust the grid to the samples. This grid update is applied to all splines in all layers at once.

model = KAN(width=[1,1], grid=G, k=k)
print(model.act_fun[0].grid) # by default, the grid is in [-1,1]
x = torch.linspace(-10,10,steps = 1001)[:,None]
model.update_grid_from_samples(x)
print(model.act_fun[0].grid) # now the grid becomes in [-10,10]. We add a 0.01 margin in case x have zero variance
>>
Parameter containing:
tensor([[-2.2000, -1.8000, -1.4000, -1.0000, -0.6000, -0.2000,  0.2000,  0.6000,
          1.0000,  1.4000,  1.8000,  2.2000]])
Parameter containing:
tensor([[-22.0000, -18.0000, -14.0000, -10.0000,  -6.0000,  -2.0000,   2.0000,
           6.0000,  10.0000,  14.0000,  18.0000,  22.0000]])

model = KAN(width=[1,1], grid=G, k=k)
print(model.act_fun[0].grid) # by default, the grid is in [-1,1]
x = torch.linspace(-0.5,0.5,steps = 1001)[:,None]
model.update_grid_from_samples(x)
print(model.act_fun[0].grid) # now the grid shrinks to [-0.5,0.5]. We add a 0.01 margin in case x have zero variance
>>
Parameter containing:
tensor([[-2.2000, -1.8000, -1.4000, -1.0000, -0.6000, -0.2000,  0.2000,  0.6000,
          1.0000,  1.4000,  1.8000,  2.2000]])
Parameter containing:
tensor([[-1.1000, -0.9000, -0.7000, -0.5000, -0.3000, -0.1000,  0.1000,  0.3000,
          0.5000,  0.7000,  0.9000,  1.1000]])
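The two printed grids can be reproduced by hand. The sketch below is my own simplified version of a uniform grid update: take the sample range, split it into G intervals, and extend by k points on each side. `uniform_grid_from_samples` is a hypothetical helper, and the small margin mentioned in the code comments above is not modeled.

```python
def uniform_grid_from_samples(samples, G, k):
    """Uniform grid with G intervals over [min, max] of the samples, extended by k points per side."""
    lo, hi = min(samples), max(samples)
    h = (hi - lo) / G
    return [lo + h * i for i in range(-k, G + k + 1)]


G, k = 5, 3
wide = [-10 + 20 * i / 1000 for i in range(1001)]   # samples in [-10, 10]
narrow = [-0.5 + i / 1000 for i in range(1001)]     # samples in [-0.5, 0.5]

print([round(g, 4) for g in uniform_grid_from_samples(wide, G, k)])
print([round(g, 4) for g in uniform_grid_from_samples(narrow, G, k)])
```

The first grid runs from -22 to 22 in steps of 4 and the second from -1.1 to 1.1 in steps of 0.2, matching the outputs shown above.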

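Grids also come in two kinds: (1) uniform grids and (2) adaptive grids. An adaptive grid is based on the sample distribution, so that each interval contains roughly the same number of samples. To choose between the two, pykan provides the grid_eps parameter: grid_eps = 1 gives a uniform grid and grid_eps = 0 gives an adaptive grid, with the uniform grid as the default. Other values are possible too, but they are not covered in this post.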

# uniform grid
model = KAN(width=[1,1], grid=G, k=k)
print(model.act_fun[0].grid) # by default, the grid is in [-1,1]
x = torch.normal(0,1,size=(1000,1))
model.update_grid_from_samples(x)
print(model.act_fun[0].grid) # now the grid follows the sample range. We add a 0.01 margin in case x have zero variance
>>
tensor([[-2.2000, -1.8000, -1.4000, -1.0000, -0.6000, -0.2000,  0.2000,  0.6000,
          1.0000,  1.4000,  1.8000,  2.2000]])
Parameter containing:
tensor([[-7.4371, -6.0845, -4.7319, -3.3793, -0.8574, -0.2884,  0.2832,  0.8376,
          3.3837,  4.7363,  6.0889,  7.4415]])

# adaptive grid based on sample distribution
model = KAN(width=[1,1], grid=G, k=k, grid_eps = 0.)
print(model.act_fun[0].grid) # by default, the grid is in [-1,1]
x = torch.normal(0,1,size=(1000,1))
model.update_grid_from_samples(x)
print(model.act_fun[0].grid) # now the grid follows the sample distribution. We add a 0.01 margin in case x have zero variance
>>
Parameter containing:
tensor([[-2.2000, -1.8000, -1.4000, -1.0000, -0.6000, -0.2000,  0.2000,  0.6000,
          1.0000,  1.4000,  1.8000,  2.2000]])
Parameter containing:
tensor([[-7.4371, -6.0845, -4.7319, -3.3793, -0.8336, -0.2805,  0.2751,  0.8132,
          3.3837,  4.7363,  6.0889,  7.4415]])
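To get a feel for the difference between the two grid types, here is a stdlib-only sketch. It builds a uniform grid and a quantile-based adaptive grid from the same samples and blends them linearly with grid_eps. The linear blend for intermediate values and the helper name `blended_grid` are assumptions on my part, and the extension by k points is left out for brevity.

```python
def blended_grid(samples, G, grid_eps):
    """Interior grid (G + 1 points): grid_eps * uniform + (1 - grid_eps) * adaptive."""
    xs = sorted(samples)
    n = len(xs)
    # adaptive: roughly the same number of samples in each interval (quantiles)
    adaptive = [xs[min(n - 1, (n * i) // G)] for i in range(G + 1)]
    # uniform: equally spaced points between the smallest and largest sample
    h = (xs[-1] - xs[0]) / G
    uniform = [xs[0] + h * i for i in range(G + 1)]
    return [grid_eps * u + (1 - grid_eps) * a for u, a in zip(uniform, adaptive)]


# skewed hypothetical samples: many points near 0, few near 1
samples = [(i / 100) ** 3 for i in range(101)]

uniform_grid = blended_grid(samples, G=4, grid_eps=1.0)
adaptive_grid = blended_grid(samples, G=4, grid_eps=0.0)
print([round(g, 3) for g in uniform_grid])   # equally spaced points
print([round(g, 3) for g in adaptive_grid])  # points crowd where the samples are dense
```

With skewed samples the adaptive grid places most of its points near 0, where the data is, while the uniform grid ignores the distribution.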

API 6 : Training Hyperparameters

Regularization makes a KAN sparser, which improves interpretability. Some hyperparameters may need tuning in the process. Let's see how the hyperparameters affect training.

from kan import *
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
dataset = create_dataset(f, n_var=2, device=device)
dataset['train_input'].shape, dataset['train_label'].shape
#>> (torch.Size([1000, 2]), torch.Size([1000, 1]))

model = KAN(width=[2,5,1], grid=5, k=3, seed=1, device=device)
model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01);
model.plot()

Parameter 1 : $\lambda$, overall penalty strength

Parameter 2 : (relative) penalty strength of entropy $\lambda_{ent}$

In the product $\lambda \lambda_{ent}$, $\lambda_{ent} = 2$ is set as the default value.
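How the two parameters interact can be sketched as follows. The example assumes the general form described in the KAN paper, where the overall strength multiplies an L1 term plus an entropy term weighted by the relative strength. The function name, the edge norms, and the exact normalization are illustrative assumptions, not pykan's implementation.

```python
import math


def regularization(edge_norms, lamb, lamb_entropy):
    """lamb * (L1 + lamb_entropy * entropy) over a list of non-negative edge norms."""
    l1 = sum(edge_norms)
    probs = [v / l1 for v in edge_norms if v > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return lamb * (l1 + lamb_entropy * entropy)


dense = [0.25, 0.25, 0.25, 0.25]   # hypothetical: four equally strong edges
sparse = [1.0, 0.0, 0.0, 0.0]      # hypothetical: one dominant edge

print(regularization(dense, lamb=0.01, lamb_entropy=2.0))
print(regularization(sparse, lamb=0.01, lamb_entropy=2.0))
print(regularization(dense, lamb=0.01, lamb_entropy=0.0))
```

Both networks have the same L1 norm, so only the entropy term separates them: with lamb_entropy = 2 the dense configuration is penalized more, and with lamb_entropy = 0 that preference for sparsity disappears.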

# train the model
model = KAN(width=[2,5,1], grid=5, k=3, seed=1, device=device)
model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01, lamb_entropy=0.0);
model.plot()

Parameter 3 : seed

API 7 : Pruning

There are two ways to prune: (1) automatic pruning and (2) manual pruning.

from kan import *

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

# create a KAN: 2D inputs, 1D output, and 5 hidden neurons. cubic spline (k=3), 5 grid intervals (grid=5).
model = KAN(width=[2,5,1], grid=5, k=3, seed=1, device=device)

# create dataset f(x,y) = exp(sin(pi*x)+y^2)
f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
dataset = create_dataset(f, n_var=2, device=device)
dataset['train_input'].shape, dataset['train_label'].shape

# train the model
model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01);
model(dataset['train_input'])
model.plot()

Pruning nodes

mode = 'auto'

if mode == 'auto':
    # automatic
    model = model.prune_node(threshold=1e-2) # by default the threshold is 1e-2
    model.plot()
elif mode == 'manual':
    # manual
    model = model.prune_node(active_neurons_id=[[0]])

Pruning Edges

model.prune_edge()
model.plot()

Pruning nodes and edges together - just use model.prune()
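The idea behind automatic node pruning can be sketched with a plain threshold filter. The scores below are hypothetical, and `prune_nodes` is an illustrative helper; how pykan actually computes a node's score is not shown here.

```python
def prune_nodes(node_scores, threshold=1e-2):
    """Indices of hidden neurons whose score exceeds the threshold."""
    return [i for i, s in enumerate(node_scores) if s > threshold]


# hypothetical importance scores of 5 hidden neurons
scores = [0.9, 0.004, 0.3, 0.0, 0.02]

print(prune_nodes(scores))                 # [0, 2, 4]
print(prune_nodes(scores, threshold=0.1))  # [0, 2]
```

Raising the threshold keeps fewer neurons, which is the knob the threshold argument of prune_node exposes in the code above.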

API 8 : Regularization

from kan import *
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
dataset = create_dataset(f, n_var=2, device=device)

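Which tensor should the L1 regularization be applied to? Currently five options are supported for reg_metric.

edge_forward_spline_n : edge norm in normalized form (output std / input std), computed from the spline only (symbolic part ignored)
edge_forward_sum : edge norm in normalized form (output std / input std), computed from both the spline and symbolic parts
edge_forward_spline_u : edge norm in unnormalized form (output std), computed from the spline only (symbolic part ignored)
edge_backward : edge attribution score
node_backward : node attribution score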
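The difference between the normalized and unnormalized forward norms is easy to see on a single edge. The sketch below uses the population standard deviation from the stdlib. The helper names and sample data are hypothetical, and whether pykan uses the population or sample standard deviation is not something I have checked.

```python
import math
import statistics


def edge_norm_n(inputs, outputs):
    """Normalized edge norm: std of the edge's outputs divided by std of its inputs."""
    return statistics.pstdev(outputs) / statistics.pstdev(inputs)


def edge_norm_u(outputs):
    """Unnormalized edge norm: std of the edge's outputs only."""
    return statistics.pstdev(outputs)


xs = [i / 10 for i in range(-10, 11)]      # hypothetical inputs on one edge
strong = [2.0 * x for x in xs]             # activation phi(x) = 2x
weak = [0.01 * math.sin(x) for x in xs]    # nearly flat activation

print(edge_norm_n(xs, strong))  # about 2
print(edge_norm_n(xs, weak))    # close to 0, a candidate for pruning
print(edge_norm_u(strong))      # depends on the input range as well
```

The normalized norm only reflects how much the activation stretches its input, while the unnormalized one also grows with the spread of the input itself.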

# train the model
model = KAN(width=[2,5,1], grid=3, k=3, seed=1, device=device)
model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01, reg_metric='edge_forward_spline_n'); # default
#model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01, reg_metric='edge_forward_sum');
#model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01, reg_metric='edge_forward_spline_u');
#model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01, reg_metric='edge_backward');
#model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01, reg_metric='node_backward');
model.plot()

There are also three options when drawing the KAN diagram.

forward_u = edge_forward_spline_u
forward_n = edge_forward_spline_n
backward = edge_backward

model.plot(metric='forward_u')
#model.plot(metric='forward_n')
#model.plot(metric='backward') # default

API 9 : Videos

If you want to save the training dynamics of the KAN plot as a video, pass save_fig = True to the fit() method, as in the code below, and set a few video-related parameters.

from kan import *
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

# create a KAN: 4D inputs, 1D output, hidden layers of width 2 and 1. cubic spline (k=3), 3 grid intervals (grid=3).
model = KAN(width=[4,2,1,1], grid=3, k=3, seed=1, device=device)
f = lambda x: torch.exp((torch.sin(torch.pi*(x[:,[0]]**2+x[:,[1]]**2))+torch.sin(torch.pi*(x[:,[2]]**2+x[:,[3]]**2)))/2)
dataset = create_dataset(f, n_var=4, train_num=3000, device=device)

image_folder = 'video_img'

# train the model
#model.train(dataset, opt="LBFGS", steps=20, lamb=1e-3, lamb_entropy=2.);
model.fit(dataset, opt="LBFGS", steps=5, lamb=0.001, lamb_entropy=2., save_fig=True, beta=10,
            in_vars=[r'$x_1$', r'$x_2$', r'$x_3$', r'$x_4$'],
            out_vars=[r'${\rm exp}({\rm sin}(x_1^2+x_2^2)+{\rm sin}(x_3^2+x_4^2))$'],
            img_folder=image_folder);

import os
import numpy as np
import moviepy.video.io.ImageSequenceClip # moviepy == 1.0.3

video_name='video'
fps=5

files = os.listdir(image_folder)
train_index = []
for file in files:
    if file[0].isdigit() and file.endswith('.jpg'):
        train_index.append(int(file[:-4]))

train_index = np.sort(train_index)

# build the frame paths in training order
image_files = [image_folder+'/'+str(index)+'.jpg' for index in train_index]

clip = moviepy.video.io.ImageSequenceClip.ImageSequenceClip(image_files, fps=fps)
clip.write_videofile(video_name+'.mp4')
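The reason the frame indices are converted to integers before sorting is worth a quick illustration: sorting the file names as strings would put step 10 before step 2. The sketch below works on a hypothetical list of file names, so it runs without any saved images, and `sorted_frames` is an illustrative helper.

```python
def sorted_frames(files, folder):
    """Paths of '<step>.jpg' frames ordered by their integer step."""
    steps = sorted(int(f[:-4]) for f in files if f[0].isdigit() and f.endswith('.jpg'))
    return [f"{folder}/{s}.jpg" for s in steps]


# hypothetical folder listing, in the arbitrary order os.listdir may return
files = ['10.jpg', '2.jpg', '0.jpg', 'notes.txt', '1.jpg']

print(sorted(files))                      # string sort puts '10.jpg' before '2.jpg'
print(sorted_frames(files, 'video_img'))  # numeric sort keeps the training order
```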

API 10 : Device

from kan import KAN, create_dataset
import torch


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

#device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device = 'cpu'
print(device)

model = KAN(width=[4,100,100,100,1], grid=3, k=3, seed=0).to(device)
f = lambda x: torch.exp((torch.sin(torch.pi*(x[:,[0]]**2+x[:,[1]]**2))+torch.sin(torch.pi*(x[:,[2]]**2+x[:,[3]]**2)))/2)
dataset = create_dataset(f, n_var=4, train_num=1000, device=device)

# train the model
#model.train(dataset, opt="LBFGS", steps=20, lamb=1e-3, lamb_entropy=2.);
model.fit(dataset, opt="Adam", lr=1e-3, steps=50, lamb=1e-3, lamb_entropy=5., update_grid=False);

model = KAN(width=[4,100,100,100,1], grid=3, k=3, seed=0).to(device)
f = lambda x: torch.exp((torch.sin(torch.pi*(x[:,[0]]**2+x[:,[1]]**2))+torch.sin(torch.pi*(x[:,[2]]**2+x[:,[3]]**2)))/2)
dataset = create_dataset(f, n_var=4, train_num=1000, device=device)

# train the model
#model.train(dataset, opt="LBFGS", steps=20, lamb=1e-3, lamb_entropy=2.);
model.fit(dataset, opt="Adam", lr=1e-3, steps=50, lamb=1e-3, lamb_entropy=5., update_grid=False);

API 11 : Create dataset

how to use create_dataset in kan.utils

# standard way

from kan.utils import create_dataset
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

f = lambda x: x[:,[0]] * x[:,[1]]
dataset = create_dataset(f, n_var=2, device=device)
dataset['train_label'].shape

To guard against mistakes such as writing x[:, 0] instead of x[:, [0]], the dataset can also be created in the simpler form below.

f = lambda x: x[:,0] * x[:,1]
dataset = create_dataset(f, n_var=2, device=device)
dataset['train_label'].shape

Or it can be done as below. (Be sure to add f_mode = 'row'!)

f = lambda x: x[0] * x[1]
dataset = create_dataset(f, n_var=2, f_mode='row', device=device)
dataset['train_label'].shape

If you already have inputs $x$ and outputs $y$ and only want to split them into train/test sets, use create_dataset_from_data.

import torch
from kan.utils import create_dataset_from_data

x = torch.rand(100,2)
y = torch.rand(100,1)
dataset = create_dataset_from_data(x, y, device=device)
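Conceptually, splitting existing data into train and test parts comes down to shuffling the indices and cutting them at a ratio. The sketch below shows that idea in plain Python. The 0.8 ratio, the 'test_input' / 'test_label' key names, and the helper name `split_dataset` are assumptions for illustration, not a description of what create_dataset_from_data does internally.

```python
import random


def split_dataset(inputs, labels, train_ratio=0.8, seed=0):
    """Shuffle sample indices and split them into train and test parts."""
    idx = list(range(len(inputs)))
    random.Random(seed).shuffle(idx)
    n_train = int(len(idx) * train_ratio)
    train_id, test_id = idx[:n_train], idx[n_train:]
    return {
        'train_input': [inputs[i] for i in train_id],
        'train_label': [labels[i] for i in train_id],
        'test_input': [inputs[i] for i in test_id],
        'test_label': [labels[i] for i in test_id],
    }


x = [[i, i + 1] for i in range(100)]  # 100 samples, 2 features
y = [[a * b] for a, b in x]           # 100 labels
dataset = split_dataset(x, y)
print(len(dataset['train_input']), len(dataset['test_input']))  # 80 20
```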

API 12 : Checkpoint, save & load model

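Whenever the KAN (model) changes (e.g. fit, prune), a new version is saved to the model.ckpt folder (default: 'model'). Version numbers have the form 'a.b', where a is the round number (starting from 0 and increasing by 1 each time model.rewind() is called) and b is the version number within the round. A freshly initialized model has version 0.0.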

from kan import *
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
dataset = create_dataset(f, n_var=2, device=device)
model = KAN(width=[2,5,1], grid=5, k=3, seed=1, auto_save=True, device=device)
model.get_act(dataset)
model.plot()
>> 
cpu
checkpoint directory created: ./model
saving model version 0.0

model.auto_save
>>
True

model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01);
model.plot()
>>
saving model version 0.1

model = model.prune()
model.plot()
>>
saving model version 0.2

To go back to version 0.1, use model = model.rewind('0.1'). This starts a new round, and version 0.1 is renamed to version 1.1.

# revert to version 0.1 (if continuing)
model = model.rewind('0.1')

# revert to version 0.1 (if starting from scratch)
#model = KAN.loadckpt('./model/' + '0.1')
#model.get_act(dataset)

model.plot()
>>
rewind to model version 0.1, renamed as 1.1

Further operations on version 1.1 advance it to version 1.2.

model.fit(dataset, opt="LBFGS", steps=2);
model.plot()
>>
saving model version 1.2
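The version numbering seen in the outputs above follows a simple rule that can be modeled in a few lines. `VersionTracker` is a hypothetical toy class that only mimics the numbering described in this section. It does not save anything and is not part of pykan.

```python
class VersionTracker:
    """Tracks 'a.b' version strings: a = round, b = version within the round."""

    def __init__(self):
        self.round, self.state = 0, 0  # the initial model is version 0.0

    @property
    def version(self):
        return f"{self.round}.{self.state}"

    def save(self):
        """Any change to the model (fit, prune, ...) bumps b."""
        self.state += 1
        return self.version

    def rewind(self, version):
        """Go back to an old version; a new round starts and b is kept."""
        _, state = version.split('.')
        self.round += 1
        self.state = int(state)
        return self.version


v = VersionTracker()
print(v.save())         # 0.1 (after fit)
print(v.save())         # 0.2 (after prune)
print(v.rewind('0.1'))  # 1.1
print(v.save())         # 1.2
```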
