High level API

ONNXRunTime.InferenceSession (Type)
(o::InferenceSession)(inputs [, output_names])

Run an InferenceSession on a collection of inputs. The inputs can be either a NamedTuple or an AbstractDict. If output_names is passed, only the outputs whose names it contains are computed.
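
Example (a sketch following the package README; the increment2x3.onnx test model and its "input"/"output" tensor names come from there):

    using ONNXRunTime
    const ORT = ONNXRunTime

    # The test model takes a 2×3 Float32 tensor named "input".
    path = ORT.testdatapath("increment2x3.onnx")
    model = ORT.load_inference(path)

    input = Dict("input" => randn(Float32, 2, 3))
    model(input)              # Dict("output" => ...)
    model(input, ["output"])  # compute only the listed outputs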

ONNXRunTime.load_inference (Method)
load_inference(
    path::AbstractString;
    execution_provider,
    envname,
    logging_level,
    provider_options
) -> InferenceSession

Load an ONNX file at path into an inference session.

Keyword arguments:

  • execution_provider: Either :cpu or :cuda. The latter requires a CUDA-capable GPU, and the CUDA and cuDNN packages must first be imported.
  • envname: Name used for logging purposes.
  • logging_level: Level of diagnostic output. Options are :verbose, :info, :warning (default), :error, and :fatal.
  • provider_options: Named tuple with options passed to the execution provider.

Note: Due to limitations of the C API's CreateEnv function, envname and logging_level can only be set once per process; subsequent attempts to change them are ignored.
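
Example (a sketch; "model.onnx" is a placeholder path, and the provider_options field follows OrtCUDAProviderOptions below):

    using ONNXRunTime: load_inference

    # CPU session with more verbose diagnostics:
    model = load_inference("model.onnx"; logging_level=:info)

    # CUDA session; import the GPU packages first:
    # using CUDA, cuDNN
    # model = load_inference("model.onnx";
    #     execution_provider=:cuda,
    #     provider_options=(; device_id=0))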

ONNXRunTime.release (Method)
release(o::InferenceSession)::Nothing

Release memory allocated to an InferenceSession. This also happens automatically when the object has gone out of scope and the garbage collector deletes it.

However, there is no guarantee when that happens, so it can be useful to manually release the memory. This is especially true when the model has allocated GPU memory, which does not put pressure on the garbage collector to run promptly.

Using the inference session after releasing it is an error.
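
Example (a sketch with a placeholder path and input name):

    using ONNXRunTime: load_inference, release

    model = load_inference("model.onnx")
    try
        model(Dict("input" => randn(Float32, 2, 3)))
    finally
        release(model)  # free native (possibly GPU) memory promptly
    end
    # Using `model` after this point is an error.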


Low level API

ONNXRunTime.CAPI.ONNXTensorElementDataType (Type)
ONNXRunTime.CAPI.ONNXTensorElementDataType

CEnum with possible values:

  • ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_INT16
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_STRING
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_DOUBLE
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT32
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT64
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX64
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX128
  • ONNX_TENSOR_ELEMENT_DATA_TYPE_BFLOAT16
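
Example (a sketch, assuming the values are exported from ONNXRunTime.CAPI and keep the integer values of the ONNX Runtime C API, where UNDEFINED is 0 and FLOAT is 1):

    using ONNXRunTime.CAPI

    Int(ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT)                              # 1
    ONNXTensorElementDataType(1) === ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT  # true
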
ONNXRunTime.CAPI.OrtCUDAProviderOptions (Method)
OrtCUDAProviderOptions(;
    device_id,
    cudnn_conv_algo_search,
    gpu_mem_limit,
    arena_extend_strategy,
    do_copy_in_default_stream,
    has_user_compute_stream,
    user_compute_stream,
    default_memory_arena_cfg
) -> ONNXRunTime.CAPI.OrtCUDAProviderOptions
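
Example (a sketch; all keywords are optional and the shown values are illustrative, with field meanings taken from the ONNX Runtime C struct of the same name):

    using ONNXRunTime.CAPI

    cuda_options = OrtCUDAProviderOptions(;
        device_id = 0,               # index of the GPU to use
        gpu_mem_limit = 4 * 1024^3,  # memory arena limit in bytes
    )
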
ONNXRunTime.CAPI.OrtLoggingLevel (Type)
ONNXRunTime.CAPI.OrtLoggingLevel

CEnum with possible values:

  • ORT_LOGGING_LEVEL_VERBOSE
  • ORT_LOGGING_LEVEL_INFO
  • ORT_LOGGING_LEVEL_WARNING
  • ORT_LOGGING_LEVEL_ERROR
  • ORT_LOGGING_LEVEL_FATAL
ONNXRunTime.CAPI.OrtMemType (Type)
ONNXRunTime.CAPI.OrtMemType

CEnum with possible values:

  • OrtMemTypeCPUInput
  • OrtMemTypeCPUOutput
  • OrtMemTypeCPU
  • OrtMemTypeDefault
ONNXRunTime.CAPI.CreateAllocator (Method)
CreateAllocator(
    api::ONNXRunTime.CAPI.OrtApi,
    session::ONNXRunTime.CAPI.OrtSession,
    meminfo::ONNXRunTime.CAPI.OrtMemoryInfo
) -> ONNXRunTime.CAPI.OrtAllocator
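
Example (a sketch, assuming an existing api and a loaded session as produced by CreateSession below; CreateCpuMemoryInfo follows the package README):

    meminfo = CreateCpuMemoryInfo(api)
    allocator = CreateAllocator(api, session, meminfo)
    # The allocator is consumed by e.g. SessionGetInputName further down
    # (the index is assumed 0-based, as in the C API):
    name = SessionGetInputName(api, session, 0, allocator)
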
ONNXRunTime.CAPI.CreateSession (Method)
CreateSession(
    api::ONNXRunTime.CAPI.OrtApi,
    env::ONNXRunTime.CAPI.OrtEnv,
    model_path::AbstractString,
    options::ONNXRunTime.CAPI.OrtSessionOptions
) -> ONNXRunTime.CAPI.OrtSession
ONNXRunTime.CAPI.CreateTensorWithDataAsOrtValue (Method)
CreateTensorWithDataAsOrtValue(
    api::ONNXRunTime.CAPI.OrtApi,
    memory_info::ONNXRunTime.CAPI.OrtMemoryInfo,
    data::Vector,
    shape
) -> ONNXRunTime.CAPI.OrtValue

Return a tensor with shape shape that is backed by the memory of data.
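
Example (a sketch, assuming an existing api and CPU memory_info; ONNX Runtime stores tensors row-major while Julia arrays are column-major, so the buffer is permuted first, as in the package README):

    x = randn(Float32, 2, 3)
    tensor = CreateTensorWithDataAsOrtValue(api, memory_info,
                                            vec(permutedims(x, (2, 1))), size(x))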

ONNXRunTime.CAPI.GetApi (Function)
GetApi(
    api_base::ONNXRunTime.CAPI.OrtApiBase
) -> ONNXRunTime.CAPI.OrtApi
GetApi(
    api_base::ONNXRunTime.CAPI.OrtApiBase,
    ort_api_version::Integer
) -> ONNXRunTime.CAPI.OrtApi
ONNXRunTime.CAPI.GetDimensions (Function)
GetDimensions(
    api::ONNXRunTime.CAPI.OrtApi,
    o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
) -> Vector{Int64}
GetDimensions(
    api::ONNXRunTime.CAPI.OrtApi,
    o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo,
    ndims
) -> Vector{Int64}
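
Example (a sketch, assuming the wrapper exposes the C API's GetTensorTypeAndShape to obtain the OrtTensorTypeAndShapeInfo of an OrtValue):

    info = GetTensorTypeAndShape(api, tensor)
    dims = GetDimensions(api, info)  # e.g. Int64[2, 3]
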
ONNXRunTime.CAPI.GetTensorMutableData (Method)
GetTensorMutableData(
    api::ONNXRunTime.CAPI.OrtApi,
    tensor::ONNXRunTime.CAPI.OrtValue
) -> PermutedDimsArray{T, N, perm, iperm, Array{T1, N1}} where {T, N, perm, iperm, T1, N1}
ONNXRunTime.CAPI.Run (Method)
Run(
    api::ONNXRunTime.CAPI.OrtApi,
    session::ONNXRunTime.CAPI.OrtSession,
    run_options::Union{Nothing, ONNXRunTime.CAPI.OrtRunOptions},
    input_names::Vector{String},
    inputs::Vector{ONNXRunTime.CAPI.OrtValue},
    output_names::Vector{String}
) -> Vector{ONNXRunTime.CAPI.OrtValue}
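
Example (an end-to-end sketch of the low level API, closely following the package README; the increment2x3.onnx test model and its "input"/"output" names come from there):

    using ONNXRunTime.CAPI
    using ONNXRunTime: testdatapath

    api = GetApi()
    env = CreateEnv(api; name="demo")
    so = CreateSessionOptions(api)
    session = CreateSession(api, env, testdatapath("increment2x3.onnx"), so)

    mem = CreateCpuMemoryInfo(api)
    x = randn(Float32, 2, 3)
    input = CreateTensorWithDataAsOrtValue(api, mem,
                                           vec(permutedims(x, (2, 1))), size(x))

    # run_options may be `nothing`, per the signature above:
    outputs = Run(api, session, nothing, ["input"], [input], ["output"])
    y = GetTensorMutableData(api, only(outputs))
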
ONNXRunTime.CAPI.SessionGetInputName (Method)
SessionGetInputName(
    api::ONNXRunTime.CAPI.OrtApi,
    session::ONNXRunTime.CAPI.OrtSession,
    index::Integer,
    allocator::ONNXRunTime.CAPI.OrtAllocator
) -> String
ONNXRunTime.CAPI.into_julia (Method)
into_julia(
    _::Type{T},
    api::ONNXRunTime.CAPI.OrtApi,
    objptr::Ref{Ptr{Nothing}},
    status_ptr::Ptr{Nothing},
    gchandles
) -> Any

Create a Julia object from the output of an API call. Check and release status_ptr.

ONNXRunTime.CAPI.release (Function)
release(api::OrtApi, obj)::Nothing

Release memory owned by obj. The garbage collector should call this function automatically. If it does not, that's a bug that should be reported.

There can, however, be situations of high memory pressure where it helps to call this function manually to release memory earlier. Using an object after releasing it is undefined behaviour.
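
Example (a sketch, reusing an api and a tensor such as the OrtValue created above):

    release(api, tensor)  # free the native memory now rather than waiting for GC
    # `tensor` must not be used after this call.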

ONNXRunTime.CAPI.unsafe_GetTensorMutableData (Method)
unsafe_GetTensorMutableData(
    api::ONNXRunTime.CAPI.OrtApi,
    tensor::ONNXRunTime.CAPI.OrtValue
) -> PermutedDimsArray{T, N, perm, iperm, Array{T1, N1}} where {T, N, perm, iperm, T1, N1}

This function is unsafe, because its output points to memory owned by tensor. After tensor is released, accessing the output becomes undefined behaviour.
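
Example (a sketch of the usual defensive pattern: copy the unsafe view while the tensor is known to be alive, then use only the copy; GC.@preserve is used here on the assumption that it keeps `tensor` rooted for the duration of the call):

    y = GC.@preserve tensor copy(unsafe_GetTensorMutableData(api, tensor))
    # `y` owns its memory and remains valid after `tensor` is released.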
