Using Kur for Real¶
Let’s get started using Kur! There are two topics we need to cover: describing models in a specification file, and actually running Kur.
Note
If you haven’t already, make sure you work through the Examples: In Depth first. Otherwise, this might feel a little overwhelming, and Kur shouldn’t make you feel like that.
Running Kur¶
Kur is pretty easy to run. It looks like this:
# usage:
kur [-v] {train | test | evaluate | build | dump | data} KURFILE.yml
# train the Kur MNIST example
kur train mnist.yml
# test the Kur speech example while showing debug output
kur -vv test speech.yml
You feed Kur a Kurfile and tell it what to do with the file (e.g. `kur train mnist.yml`, which is the first example in Examples: In Depth). Train, test, and evaluate will be explained in detail in the next section on Kurfile specification. The other commands are:
- `build` is kind of like `train`, but it does not load the data set and doesn’t actually start training. Instead, `build` just assembles the model. This is useful for debugging the construction of models, looking for obvious problems, without bothering to load and train on data just yet.
- `dump` displays a JSON representation of the parsed Kurfile. It doesn’t do anything else, and is primarily useful as a debugging tool to see what information Kur is using. If you are using complicated Jinja2 syntax, including multiple Kurfiles, or want to verify YAML anchors, then this is an invaluable debugging command.
- `data` will *not* train/test/evaluate a model, but it will do *almost* everything else. Instead of actually using the model, it will print out the data that it *would* have fed into the very first batch. This is useful for checking that data is flowing through Kur in the way you expect.
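For instance, a typical debugging workflow with these commands might look like the following sketch (assuming your Kurfile is named `mnist.yml`; substitute your own file):

```shell
# Assemble the model without loading any data, to catch construction errors early:
kur build mnist.yml

# Print the fully parsed Kurfile as JSON (Jinja2 templates and YAML anchors resolved):
kur dump mnist.yml

# Show the data that would be fed into the very first batch, with DEBUG output:
kur -vv data mnist.yml
```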
Note: Keep in mind the optional flags -v and -vv. If you are curious about the lower-level details of what Kur is doing (like the nuts and bolts of how the network is being built), then feel free to add -v to enable INFO-level output, or even -vv to enable DEBUG-level output (there’s a lot).
By default, Kur tries not to be overly verbose in terminal output, so it only prints progress indicators, warnings and errors to the console while running. You should definitely pay attention to any warnings or errors: they are indicative of something unexpected or wrong. If Kur seems “stuck,” try enabling verbose output to see what it is up to.
Note
One other tip: the train/test/evaluate commands can all take a --step
flag. This sets a “breakpoint” just before data is given to the model at
each batch, requiring you to press ENTER to continue. Moreover, if
DEBUG-level output is enabled (-vv), then the entire batch of data will
be printed to the screen, and the model predictions will be printed
immediately after the batch is processed. This is primarily useful to Kur
developers who want to inspect the data that is being passed into the
model.
Kurfile Template & Info¶
Kur uses Kurfiles. These are “specification” files which describe the model, hyperparameters, data sets, training/evaluation options, and functional settings. This doc gives a quick, whirlwind tour of how Kur interprets the specification files. Kurfiles can be written in YAML or JSON.
For more details, see Kurfile Specification.
YAML Notes¶
Since YAML is the default supported Kur format, here are a couple of pointers.
- YAML documents need to start with three dashes: `---`. Everything you add to the file should go below those dashes.
- Documents can be explicitly terminated by three periods: `...`, but this is optional.
- YAML is a “whitespace matters” language. You should never use tabs in YAML.
- YAML files should use the `.yaml` or `.yml` extension.
- YAML comments start with the hash character: `#`.
For details of the YAML syntax, take a look at the Ansible overview.
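As a quick illustration of these points (and of the YAML anchors that `kur dump` can help you verify), here is a hypothetical settings fragment; the variable names are made up for this example, not anything Kur requires:

```yaml
---
# A comment. `&defaults` defines an anchor; `<<: *defaults` merges it back in.
settings:
  optimizer_defaults: &defaults
    name: adam
    learning_rate: 0.001
  fast_optimizer:
    <<: *defaults        # reuse everything from the anchor...
    learning_rate: 0.01  # ...but override one key
...
```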
Skeleton Outline of a Simple Kurfile¶
This is a simplified template you can use for writing Kur models.
```yaml
---
# Other kurfiles to load (optional)
include:

# Global variables go here (optional)
settings:

# Your core model goes here (required)
model:
  # Input data
  - input: INPUT
  # ... other layers ...
  # Last layer. Change "softmax" if it is appropriate.
  - activation: softmax
    name: OUTPUT

# All the information you need for training.
train:
  # Where to get training data from.
  # NOTE: `TRAIN_DATA` needs to have dictionary keys named `INPUT` and
  # `OUTPUT`, corresponding to the `INPUT` and `OUTPUT` names in the
  # model section above.
  data:
    - pickle: TRAIN_DATA
  # Try playing with the batch size and watching accuracy and speed.
  provider:
    batch_size: 32
  # How many epochs to train for.
  epochs: 10
  # Where to load and save weights.
  weights:
    initial: INITIAL_WEIGHTS
    best: BEST_TRAINING_LOSS_WEIGHTS
    last: MOST_RECENT_WEIGHTS
  # The optimizer to use. Try doubling or halving the learning rate.
  optimizer:
    name: adam
    learning_rate: 0.001

# You need this section if you want to run validation checks during
# training.
validate:
  data:
    - pickle: VALIDATION_DATA
  # Where to save the best validation weights.
  weights: BEST_VALIDATION_LOSS_WEIGHTS

# You need this section only if you want to run standalone test runs to
# calculate loss.
test:
  data:
    - pickle: TEST_DATA
  # Which weights to use for testing.
  weights: BEST_VALIDATION_LOSS_WEIGHTS

# This section is for trying out your model on new data.
evaluate:
  # The data to supply as input. Unlike the train/validate/test sections,
  # you do not need a corresponding `OUTPUT` key. But if you do supply one,
  # Kur can save it to the output file for you so it's easy to use during
  # post-processing.
  data:
    - pickle: NEW_DATA
  # Which weights to use for evaluation.
  weights: BEST_VALIDATION_LOSS_WEIGHTS
  # Where to save the result (as a Python pickle).
  destination: RESULTS.pkl

# Required for training, validation and testing.
loss:
  # You need an entry whose target is `OUTPUT` from the model section above.
  - target: OUTPUT
    # The name of the loss function. Change it if appropriate.
    name: categorical_crossentropy
...
```
We’re going to cover the simplest details of these sections.
- `include`: You only need this if you’ve split your specification into multiple files. Otherwise, you can leave it empty or just remove it.
- `settings`: This is the place where you can set global variables that you want to reference later using the templating engine (e.g., data sets or model hyperparameters). If you don’t have any variables, you can just leave this section empty or remove it.
- `model`: This is the fun part! Make sure you have an `input` entry, and give the final layer a name, too (it’s your output). The names need to correspond to the data that gets loaded during training, evaluation, etc. For a full list of “containers” (that’s what Kur calls each entry in the model section), see Containers: Layers & Operators. The Examples: In Depth are also a good place to start.
- `train`: Everything you want to tell Kur about the desired training process.
  - The `data` section just tells Kur to load a pickled Python file called `TRAIN_DATA`. That file should be a Python dictionary with keys corresponding to the input/output names you chose in the `model` section. The values in that dictionary should be numpy arrays that you want to feed into the Kur model.
  - The `batch_size` can be used to change how many training samples Kur uses at each step in the training process.
  - `epochs` tells Kur how many iterations over the entire training set it should run through before stopping.
  - The `weights` section tells Kur where it should save the state of the model (the model weights or parameters). This section tells Kur to load any existing weights from the `initial` file; these weights might exist because you’ve already trained the model a few times and now you want to train some more, picking up where you left off. If this `initial` file doesn’t exist, Kur just assumes it’s your first time through the training process and chugs along merrily. The `best` file tells Kur where to save the weights whenever they produce the lowest loss (with respect to the training data) that Kur has seen yet. The `last` file is where Kur saves the weights before it stops training.
  - The `optimizer` is where you tell Kur which algorithm it should use to try to improve the model’s performance and minimize loss.
- `validate`: It’s usually a good idea to have a validation set that you can use to independently assess how the model is performing. This is the place for it! It accepts a `data` section just like `train`, and the `weights` entry tells Kur where to save the weights whenever they produce the lowest historical loss with respect to the validation data.
- `test`: If you have a test set, put its `data` specification here. The `weights` field tells Kur which weights you want it to load first.
- `evaluate`: This is where you put information about new data sets you want to apply your model to. And you guessed it: the `data` section is just like all the others. The difference is that the pickled data dictionary no longer requires a key corresponding to the model output from the `model` section; but if you give Kur the true output data anyway, it can use it for additional statistics and save it to the output file for you. The `weights` field tells Kur which weights to load before evaluating. `destination` names the output file where Kur should save the model results. It will save them as a Python pickle.
- `loss`: Every model output needs a corresponding loss function. Make sure you have a `target` for each model output (it should have the same name, too, just like the data files).
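To make the `data` sections concrete, here is a sketch of how you might create the `TRAIN_DATA` pickle that the skeleton above references. The shapes and random values are placeholders for this example, not anything Kur requires; the `RESULTS.pkl` file that `evaluate` writes is a plain pickle too, so it can be read back the same way.

```python
import os
import pickle
import tempfile

import numpy

# Keys must match the layer names chosen in the `model` section
# (`INPUT` and `OUTPUT` in the skeleton above). Values are numpy arrays.
train_data = {
    'INPUT': numpy.random.rand(100, 28, 28),                    # placeholder features
    'OUTPUT': numpy.eye(10)[numpy.random.randint(0, 10, 100)],  # one-hot labels
}

# Write the dictionary as a pickle; point `- pickle: TRAIN_DATA` at this path.
path = os.path.join(tempfile.gettempdir(), 'TRAIN_DATA')
with open(path, 'wb') as fh:
    pickle.dump(train_data, fh)

# Reading it back (the same recipe works for the `destination` file,
# e.g. RESULTS.pkl, after an evaluate run):
with open(path, 'rb') as fh:
    loaded = pickle.load(fh)

print(sorted(loaded.keys()))  # ['INPUT', 'OUTPUT']
```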