High level API
ONNXRunTime.InferenceSession — Type
(o::InferenceSession)(inputs [,output_names])
Run an InferenceSession on a collection of inputs. Here inputs can be either a NamedTuple or an AbstractDict. Optionally, output_names can be passed, in which case only the outputs whose names are contained in output_names are computed.
ONNXRunTime.InferenceSession — Type
struct InferenceSession
Represents an inference session. Should only be created by calling load_inference.
ONNXRunTime.load_inference — Method
load_inference(
    path::AbstractString;
    execution_provider,
    envname,
    logging_level,
    provider_options
) -> InferenceSession
Load an ONNX file at path into an inference session.
Keyword arguments:
- execution_provider: Either :cpu or :cuda. The latter requires a CUDA-capable GPU, and the CUDA and cuDNN packages must first be imported.
- envname: Name used for logging purposes.
- logging_level: Level of diagnostic output. Options are :verbose, :info, :warning (default), :error, and :fatal.
- provider_options: Named tuple with options passed to the execution provider.
Note: Due to limitations of the C API CreateEnv function, envname and logging_level can only be set once per process. Attempts to change these are ignored.
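The high-level workflow described above can be sketched as follows. This is a minimal example, not a definitive recipe: it assumes the toy model increment2x3.onnx and the helper ONNXRunTime.testdatapath that ship with the package, and that the model's tensors are named "input" and "output". Substitute your own path and names otherwise.

```julia
import ONNXRunTime as ORT

# Assumption: `testdatapath` and the toy model are bundled with the package.
path = ORT.testdatapath("increment2x3.onnx")

# Load on the CPU; :cuda would additionally require importing CUDA and cuDNN first.
model = ORT.load_inference(path; execution_provider = :cpu)

# Inputs are keyed by the model's input names (assumed here to be "input").
input = Dict("input" => randn(Float32, 2, 3))

# Compute all outputs of the model.
outputs = model(input)

# Compute only the outputs whose names are listed in the second argument.
selected = model(input, ["output"])
```

Passing output_names is mainly useful for models with several outputs, where skipping unneeded ones can save work.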
ONNXRunTime.release — Method
release(o::InferenceSession)::Nothing
Release memory allocated to an InferenceSession. This also happens automatically when the object has gone out of scope and the garbage collector deletes it.
However, there is no guarantee when that happens, so it can be useful to manually release the memory. This is especially true when the model has allocated GPU memory, which does not put pressure on the garbage collector to run promptly.
Using the inference session after releasing is an error.
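One way to make sure a session is released promptly is a try/finally block. The sketch below assumes the bundled toy model increment2x3.onnx with an input named "input"; both are illustrative assumptions.

```julia
import ONNXRunTime as ORT

# Assumption: `testdatapath` and the toy model are bundled with the package.
model = ORT.load_inference(ORT.testdatapath("increment2x3.onnx"))

result = try
    model(Dict("input" => zeros(Float32, 2, 3)))
finally
    # Free the session's memory now instead of waiting for the garbage collector.
    ORT.release(model)
end

# `model` must not be called again after this point.
```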
Low level API
ONNXRunTime.CAPI — Module
module CAPI
This module closely follows the official onnxruntime C-API. See here for a C code example.
ONNXRunTime.CAPI.ONNXTensorElementDataType — Type
ONNXRunTime.CAPI.ONNXTensorElementDataType
CEnum with possible values:
- ONNXTENSORELEMENTDATATYPE_UNDEFINED
- ONNXTENSORELEMENTDATATYPE_FLOAT
- ONNXTENSORELEMENTDATATYPE_UINT8
- ONNXTENSORELEMENTDATATYPE_INT8
- ONNXTENSORELEMENTDATATYPE_UINT16
- ONNXTENSORELEMENTDATATYPE_INT16
- ONNXTENSORELEMENTDATATYPE_INT32
- ONNXTENSORELEMENTDATATYPE_INT64
- ONNXTENSORELEMENTDATATYPE_STRING
- ONNXTENSORELEMENTDATATYPE_BOOL
- ONNXTENSORELEMENTDATATYPE_FLOAT16
- ONNXTENSORELEMENTDATATYPE_DOUBLE
- ONNXTENSORELEMENTDATATYPE_UINT32
- ONNXTENSORELEMENTDATATYPE_UINT64
- ONNXTENSORELEMENTDATATYPE_COMPLEX64
- ONNXTENSORELEMENTDATATYPE_COMPLEX128
- ONNXTENSORELEMENTDATATYPE_BFLOAT16
ONNXRunTime.CAPI.OrtAllocator — Type
ONNXRunTime.CAPI.OrtAllocator
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtAllocator.
ONNXRunTime.CAPI.OrtAllocatorType — Type
ONNXRunTime.CAPI.OrtAllocatorType
CEnum with possible values:
- Invalid
- OrtDeviceAllocator
- OrtArenaAllocator
ONNXRunTime.CAPI.OrtApi — Type
struct OrtApi
ONNXRunTime.CAPI.OrtApiBase — Type
struct OrtApiBase
ONNXRunTime.CAPI.OrtArenaCfg — Type
ONNXRunTime.CAPI.OrtArenaCfg
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtArenaCfg.
ONNXRunTime.CAPI.OrtCUDAProviderOptions — Type
struct OrtCUDAProviderOptions
ONNXRunTime.CAPI.OrtCUDAProviderOptions — Method
OrtCUDAProviderOptions(
;
    device_id,
    cudnn_conv_algo_search,
    gpu_mem_limit,
    arena_extend_strategy,
    do_copy_in_default_stream,
    has_user_compute_stream,
    user_compute_stream,
    default_memory_arena_cfg
) -> ONNXRunTime.CAPI.OrtCUDAProviderOptions
ONNXRunTime.CAPI.OrtCudnnConvAlgoSearch — Type
ONNXRunTime.CAPI.OrtCudnnConvAlgoSearch
CEnum with possible values:
- EXHAUSTIVE
- HEURISTIC
- DEFAULT
ONNXRunTime.CAPI.OrtCustomOpDomain — Type
ONNXRunTime.CAPI.OrtCustomOpDomain
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtCustomOpDomain.
ONNXRunTime.CAPI.OrtEnv — Type
ONNXRunTime.CAPI.OrtEnv
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtEnv.
ONNXRunTime.CAPI.OrtIoBinding — Type
ONNXRunTime.CAPI.OrtIoBinding
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtIoBinding.
ONNXRunTime.CAPI.OrtLoggingLevel — Type
ONNXRunTime.CAPI.OrtLoggingLevel
CEnum with possible values:
- ORTLOGGINGLEVEL_VERBOSE
- ORTLOGGINGLEVEL_INFO
- ORTLOGGINGLEVEL_WARNING
- ORTLOGGINGLEVEL_ERROR
- ORTLOGGINGLEVEL_FATAL
ONNXRunTime.CAPI.OrtMapTypeInfo — Type
ONNXRunTime.CAPI.OrtMapTypeInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtMapTypeInfo.
ONNXRunTime.CAPI.OrtMemType — Type
ONNXRunTime.CAPI.OrtMemType
CEnum with possible values:
- OrtMemTypeCPUInput
- OrtMemTypeCPUOutput
- OrtMemTypeCPU (alias with the same value as OrtMemTypeCPUOutput)
- OrtMemTypeDefault
ONNXRunTime.CAPI.OrtMemoryInfo — Type
ONNXRunTime.CAPI.OrtMemoryInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtMemoryInfo.
ONNXRunTime.CAPI.OrtModelMetadata — Type
ONNXRunTime.CAPI.OrtModelMetadata
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtModelMetadata.
ONNXRunTime.CAPI.OrtPrepackedWeightsContainer — Type
ONNXRunTime.CAPI.OrtPrepackedWeightsContainer
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtPrepackedWeightsContainer.
ONNXRunTime.CAPI.OrtRunOptions — Type
ONNXRunTime.CAPI.OrtRunOptions
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtRunOptions.
ONNXRunTime.CAPI.OrtSequenceTypeInfo — Type
ONNXRunTime.CAPI.OrtSequenceTypeInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtSequenceTypeInfo.
ONNXRunTime.CAPI.OrtSession — Type
ONNXRunTime.CAPI.OrtSession
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtSession.
ONNXRunTime.CAPI.OrtSessionOptions — Type
ONNXRunTime.CAPI.OrtSessionOptions
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtSessionOptions.
ONNXRunTime.CAPI.OrtStatus — Type
ONNXRunTime.CAPI.OrtStatus
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtStatus.
ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo — Type
ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo.
ONNXRunTime.CAPI.OrtThreadingOptions — Type
ONNXRunTime.CAPI.OrtThreadingOptions
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtThreadingOptions.
ONNXRunTime.CAPI.OrtTypeInfo — Type
ONNXRunTime.CAPI.OrtTypeInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtTypeInfo.
ONNXRunTime.CAPI.OrtValue — Type
ONNXRunTime.CAPI.OrtValue
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtValue.
ONNXRunTime.CAPI.AllocatorFree — Method
AllocatorFree(
    api::ONNXRunTime.CAPI.OrtApi,
    allocator::ONNXRunTime.CAPI.OrtAllocator,
    ptr::Ptr
)
ONNXRunTime.CAPI.CreateAllocator — Method
CreateAllocator(
    api::ONNXRunTime.CAPI.OrtApi,
    session::ONNXRunTime.CAPI.OrtSession,
    meminfo::ONNXRunTime.CAPI.OrtMemoryInfo
) -> ONNXRunTime.CAPI.OrtAllocator
ONNXRunTime.CAPI.CreateArenaCfgV2 — Method
CreateArenaCfgV2(
    api::ONNXRunTime.CAPI.OrtApi,
    keys,
    vals
) -> ONNXRunTime.CAPI.OrtArenaCfg
ONNXRunTime.CAPI.CreateCpuMemoryInfo — Method
CreateCpuMemoryInfo(
    api::ONNXRunTime.CAPI.OrtApi;
    allocator_type,
    mem_type
) -> ONNXRunTime.CAPI.OrtMemoryInfo
ONNXRunTime.CAPI.CreateEnv — Method
CreateEnv(api::ONNXRunTime.CAPI.OrtApi; logging_level, name)
ONNXRunTime.CAPI.CreateRunOptions — Method
CreateRunOptions(
    api::ONNXRunTime.CAPI.OrtApi
) -> ONNXRunTime.CAPI.OrtRunOptions
ONNXRunTime.CAPI.CreateSession — Method
CreateSession(
    api::ONNXRunTime.CAPI.OrtApi,
    env::ONNXRunTime.CAPI.OrtEnv,
    model_path::AbstractString,
    options::ONNXRunTime.CAPI.OrtSessionOptions
) -> ONNXRunTime.CAPI.OrtSession
ONNXRunTime.CAPI.CreateSessionOptions — Method
CreateSessionOptions(
    api::ONNXRunTime.CAPI.OrtApi
) -> ONNXRunTime.CAPI.OrtSessionOptions
ONNXRunTime.CAPI.CreateTensorWithDataAsOrtValue — Method
CreateTensorWithDataAsOrtValue(
    api::ONNXRunTime.CAPI.OrtApi,
    memory_info::ONNXRunTime.CAPI.OrtMemoryInfo,
    data::Vector,
    shape
) -> ONNXRunTime.CAPI.OrtValue
Return a tensor with shape shape that is backed by the memory of data.
ONNXRunTime.CAPI.Free — Method
Free(
    alloc::ONNXRunTime.CAPI.OrtAllocator,
    ptr::Union{Cstring, Ptr}
)
ONNXRunTime.CAPI.GetApi — Function
GetApi(
    api_base::ONNXRunTime.CAPI.OrtApiBase
) -> ONNXRunTime.CAPI.OrtApi
GetApi(
    api_base::ONNXRunTime.CAPI.OrtApiBase,
    ort_api_version::Integer
) -> ONNXRunTime.CAPI.OrtApi
ONNXRunTime.CAPI.GetDimensions — Function
GetDimensions(
    api::ONNXRunTime.CAPI.OrtApi,
    o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
) -> Vector{Int64}
GetDimensions(
    api::ONNXRunTime.CAPI.OrtApi,
    o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo,
    ndims
) -> Vector{Int64}
ONNXRunTime.CAPI.GetDimensionsCount — Method
GetDimensionsCount(
    api::ONNXRunTime.CAPI.OrtApi,
    o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
) -> UInt64
ONNXRunTime.CAPI.GetErrorMessage — Method
GetErrorMessage(
    api::ONNXRunTime.CAPI.OrtApi,
    status::Ptr{Nothing}
) -> String
ONNXRunTime.CAPI.GetTensorElementType — Method
GetTensorElementType(
    api::ONNXRunTime.CAPI.OrtApi,
    o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
) -> ONNXRunTime.CAPI.ONNXTensorElementDataType
ONNXRunTime.CAPI.GetTensorMutableData — Method
GetTensorMutableData(
    api::ONNXRunTime.CAPI.OrtApi,
    tensor::ONNXRunTime.CAPI.OrtValue
) -> PermutedDimsArray{T, N, perm, iperm, Array{T1, N1}} where {T, N, perm, iperm, N1, T1}
ONNXRunTime.CAPI.GetTensorTypeAndShape — Method
GetTensorTypeAndShape(
    api::ONNXRunTime.CAPI.OrtApi,
    o::ONNXRunTime.CAPI.OrtValue
) -> ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
ONNXRunTime.CAPI.GetVersionString — Method
GetVersionString(
    api_base::ONNXRunTime.CAPI.OrtApiBase
) -> String
ONNXRunTime.CAPI.IsTensor — Method
IsTensor(
    api::ONNXRunTime.CAPI.OrtApi,
    val::ONNXRunTime.CAPI.OrtValue
) -> Bool
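The tensor inspection functions above can be combined as in the following sketch. It builds a tensor from Julia data, so no model file is needed. It assumes that OrtGetApiBase accepts execution_provider = :cpu, that CreateCpuMemoryInfo can be called with its default keyword arguments, and that a tuple is accepted for shape; none of this is confirmed by the signatures alone.

```julia
import ONNXRunTime: CAPI

# Obtain the API handle (assumption: :cpu is a valid execution provider here).
api = CAPI.GetApi(CAPI.OrtGetApiBase(execution_provider = :cpu))
mem = CAPI.CreateCpuMemoryInfo(api)

# The tensor is backed by the memory of `data`; the shape is illustrative.
data = Float32[1, 2, 3, 4, 5, 6]
tensor = CAPI.CreateTensorWithDataAsOrtValue(api, mem, data, (2, 3))

# Query what kind of value this is, its shape, and its element type.
is_tensor = CAPI.IsTensor(api, tensor)
info = CAPI.GetTensorTypeAndShape(api, tensor)
rank = CAPI.GetDimensionsCount(api, info)
dims = CAPI.GetDimensions(api, info)
T = CAPI.juliatype(CAPI.GetTensorElementType(api, info))
```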
ONNXRunTime.CAPI.OrtGetApiBase — Method
OrtGetApiBase(
;
    execution_provider
) -> ONNXRunTime.CAPI.OrtApiBase
ONNXRunTime.CAPI.Run — Method
Run(
    api::ONNXRunTime.CAPI.OrtApi,
    session::ONNXRunTime.CAPI.OrtSession,
    run_options::Union{Nothing, ONNXRunTime.CAPI.OrtRunOptions},
    input_names::Vector{String},
    inputs::Vector{ONNXRunTime.CAPI.OrtValue},
    output_names::Vector{String}
) -> Vector{ONNXRunTime.CAPI.OrtValue}
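Putting the low-level calls together, an end-to-end inference run might look like the sketch below. It assumes the bundled toy model increment2x3.onnx (located via ONNXRunTime.testdatapath) with tensors named "input" and "output", and it relies on default keyword arguments for CreateEnv and CreateCpuMemoryInfo. Treat it as an illustration of the call order rather than a reference implementation.

```julia
import ONNXRunTime: CAPI, testdatapath

# API handle, environment, and session.
api = CAPI.GetApi(CAPI.OrtGetApiBase(execution_provider = :cpu))
env = CAPI.CreateEnv(api; name = "myenv")
session_options = CAPI.CreateSessionOptions(api)
path = testdatapath("increment2x3.onnx")   # assumption: bundled toy model
session = CAPI.CreateSession(api, env, path, session_options)

# Wrap Julia data in an OrtValue; the tensor shares memory with the vector.
mem = CAPI.CreateCpuMemoryInfo(api)
input_array = randn(Float32, 2, 3)
input_tensor = CAPI.CreateTensorWithDataAsOrtValue(
    api, mem, vec(input_array), size(input_array))

# Run the model; the input and output names are assumptions about this model.
run_options = CAPI.CreateRunOptions(api)
outputs = CAPI.Run(api, session, run_options,
                   ["input"], [input_tensor], ["output"])

# Convert the single output tensor to a Julia array.
output_array = CAPI.GetTensorMutableData(api, only(outputs))
```

Compared with load_inference, this route exposes every intermediate object, which is useful when session options or run options need to be customised.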
ONNXRunTime.CAPI.SessionGetInputCount — Method
SessionGetInputCount(
    api::ONNXRunTime.CAPI.OrtApi,
    session::ONNXRunTime.CAPI.OrtSession
) -> UInt64
ONNXRunTime.CAPI.SessionGetInputName — Method
SessionGetInputName(
    api::ONNXRunTime.CAPI.OrtApi,
    session::ONNXRunTime.CAPI.OrtSession,
    index::Integer,
    allocator::ONNXRunTime.CAPI.OrtAllocator
) -> String
ONNXRunTime.CAPI.SessionGetModelMetadata — Method
SessionGetModelMetadata(
    api::ONNXRunTime.CAPI.OrtApi,
    session::ONNXRunTime.CAPI.OrtSession
) -> ONNXRunTime.CAPI.OrtModelMetadata
ONNXRunTime.CAPI.SessionGetOutputCount — Method
SessionGetOutputCount(
    api::ONNXRunTime.CAPI.OrtApi,
    sess::ONNXRunTime.CAPI.OrtSession
) -> UInt64
ONNXRunTime.CAPI.SessionGetOutputName — Method
SessionGetOutputName(
    api::ONNXRunTime.CAPI.OrtApi,
    session::ONNXRunTime.CAPI.OrtSession,
    index::Integer,
    allocator::ONNXRunTime.CAPI.OrtAllocator
) -> String
ONNXRunTime.CAPI.SessionOptionsAppendExecutionProvider_CUDA — Method
SessionOptionsAppendExecutionProvider_CUDA(
    api::ONNXRunTime.CAPI.OrtApi,
    session_options::ONNXRunTime.CAPI.OrtSessionOptions,
    cuda_options::ONNXRunTime.CAPI.OrtCUDAProviderOptions
)
ONNXRunTime.CAPI.into_julia — Method
into_julia(
    _::Type{T},
    api::ONNXRunTime.CAPI.OrtApi,
    objptr::Ref{Ptr{Nothing}},
    status_ptr::Ptr{Nothing},
    gchandles
) -> Any
Create a Julia object from the output of an API call. Check and release status_ptr.
ONNXRunTime.CAPI.juliatype — Method
juliatype(
    onnx::ONNXRunTime.CAPI.ONNXTensorElementDataType
) -> Type
ONNXRunTime.CAPI.release — Function
release(api::OrtApi, obj)::Nothing
Release memory owned by obj. The garbage collector should call this function automatically. If it does not, that's a bug that should be reported.
There might, however, be situations with high memory pressure. In these situations it might help to call this function manually to release memory earlier. Using an object after releasing it is undefined behaviour.
ONNXRunTime.CAPI.unsafe_GetTensorMutableData — Method
unsafe_GetTensorMutableData(
    api::ONNXRunTime.CAPI.OrtApi,
    tensor::ONNXRunTime.CAPI.OrtValue
) -> PermutedDimsArray{T, N, perm, iperm, Array{T1, N1}} where {T, N, perm, iperm, N1, T1}
This function is unsafe, because its output points to memory owned by tensor. After tensor is released, accessing the output becomes undefined behaviour.
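A defensive pattern for the lifetime issue described above is to copy the data into Julia-owned memory before the tensor is released. The sketch below assumes that OrtGetApiBase accepts execution_provider = :cpu, that CreateCpuMemoryInfo works with its defaults, and that a tuple is accepted for shape; the data and shape are made up for illustration.

```julia
import ONNXRunTime: CAPI

api = CAPI.GetApi(CAPI.OrtGetApiBase(execution_provider = :cpu))
mem = CAPI.CreateCpuMemoryInfo(api)

data = Float32[1, 2, 3, 4, 5, 6]
tensor = CAPI.CreateTensorWithDataAsOrtValue(api, mem, data, (2, 3))

# `borrowed` points to memory owned by `tensor`.
borrowed = CAPI.unsafe_GetTensorMutableData(api, tensor)

# Copy into an ordinary Julia Array while `tensor` is still alive.
owned = collect(borrowed)

# After releasing, only `owned` may be used; `borrowed` must not be touched.
CAPI.release(api, tensor)
```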