modelPath: Path to the model on the filesystem.
Optional batchSize: Prompt processing batch size.
Optional contextSize: Text context size.
Optional embedding: Embedding mode only.
Optional f16Kv: Use fp16 for the KV cache.
Optional gpuLayers: Number of layers to store in VRAM.
Optional logitsAll: The llama_eval() call computes all logits, not just the last one.
Optional maxTokens: The maximum number of tokens to generate.
Optional prependBos: Add the beginning of sentence token.
Optional seed: If null, a random seed will be used.
Optional temperature: The randomness of the responses: e.g. 0.1 is deterministic, 1.5 is creative, 0.8 is balanced, and 0 disables sampling.
Optional threads: Number of threads to use to evaluate tokens.
Optional topK: Consider the n most likely tokens, where n ranges from 1 to the vocabulary size; 0 disables (uses the full vocabulary). Note: only applies when temperature > 0.
Optional topP: Select the smallest token set whose cumulative probability exceeds P, where P is between 0 and 1; 1 disables. Note: only applies when temperature > 0.
Optional trimWhitespaceSuffix: Trim whitespace from the end of the generated text. Disabled by default.
Optional useMlock: Force the system to keep the model in RAM.
Optional useMmap: Use mmap if possible.
Optional vocabOnly: Only load the vocabulary, no weights.
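Taken together, these options form a single configuration object. The sketch below illustrates that shape; the interface name LlamaCppOptions and the sample values are assumptions for illustration, and only the property names come from the list above.

// Sketch of the configuration shape documented above. The interface name
// LlamaCppOptions is assumed for illustration; the property names match the list.
interface LlamaCppOptions {
  modelPath: string;      // required: path to the model on the filesystem
  batchSize?: number;     // prompt processing batch size
  contextSize?: number;   // text context size
  gpuLayers?: number;     // number of layers to store in VRAM
  threads?: number;       // threads used to evaluate tokens
  temperature?: number;   // 0 disables sampling
  topK?: number;          // only applies when temperature > 0
  topP?: number;          // only applies when temperature > 0
  useMmap?: boolean;      // use mmap if possible
  useMlock?: boolean;     // force the system to keep the model in RAM
}

// Example: a balanced sampling setup with partial GPU offload (values illustrative).
const options: LlamaCppOptions = {
  modelPath: "/models/llama-2-7b.Q4_K_M.gguf",
  contextSize: 2048,
  batchSize: 512,
  gpuLayers: 32,
  temperature: 0.8,
  topK: 40,
  topP: 0.95,
};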
Note that modelPath is the only required parameter. For testing, you can supply it through the LLAMA_PATH environment variable.
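A minimal sketch of that testing setup, assuming a Node.js environment (process.env); the error message is illustrative:

// Read the model path from LLAMA_PATH for tests; only modelPath is required.
const modelPath = process.env.LLAMA_PATH;
if (modelPath === undefined) {
  throw new Error("Set LLAMA_PATH to a model file before running tests.");
}
const testOptions = { modelPath }; // all other options fall back to their defaults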