High level API
ONNXRunTime.InferenceSession
— Type
(o::InferenceSession)(inputs [, output_names])
Run an InferenceSession on a collection of inputs. Here inputs can be either a NamedTuple or an AbstractDict. Optionally output_names can be passed; in that case only the outputs whose names are contained in output_names are computed.
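For concreteness, a minimal usage sketch. The file name "model.onnx", the input name "input", the output name "output", and the 2×3 Float32 shape are placeholders for whatever the actual model declares:

using ONNXRunTime

model = load_inference("model.onnx")

# Inputs passed as an AbstractDict:
out = model(Dict("input" => randn(Float32, 2, 3)))

# Inputs passed as a NamedTuple, computing only the outputs listed in output_names:
out = model((; input = randn(Float32, 2, 3)), ["output"])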
ONNXRunTime.InferenceSession
— Type
struct InferenceSession
Represents an inference session. Should only be created by calling load_inference.
ONNXRunTime.load_inference
— Method
load_inference(
path::AbstractString;
execution_provider,
envname,
logging_level,
provider_options
) -> InferenceSession
Load an ONNX file at path into an inference session.
Keyword arguments:
- execution_provider: Either :cpu or :cuda. The latter requires a CUDA-capable GPU, and the CUDA and cuDNN packages must be imported first.
- envname: Name used for logging purposes.
- logging_level: Level of diagnostic output. Options are :verbose, :info, :warning (default), :error, and :fatal.
- provider_options: Named tuple with options passed to the execution provider.
Note: Due to limitations of the C API CreateEnv function, envname and logging_level can only be set once per process. Attempts to change them are ignored.
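As a hedged sketch of the keyword arguments: the file name is a placeholder, and the device_id entry in provider_options is an assumption based on the fields of OrtCUDAProviderOptions in the low level API.

using CUDA, cuDNN   # the :cuda provider requires these packages to be imported first
using ONNXRunTime

model = load_inference("model.onnx";
                       execution_provider = :cuda,
                       provider_options = (; device_id = 0),
                       logging_level = :error)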
ONNXRunTime.release
— Method
release(o::InferenceSession)::Nothing
Release memory allocated to an InferenceSession. This also happens automatically when the object goes out of scope and the garbage collector deletes it.
However, there is no guarantee when that happens, so it can be useful to release the memory manually. This is especially true when the model has allocated GPU memory, which does not put pressure on the garbage collector to run promptly.
Using the inference session after releasing it is an error.
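A short sketch of eager cleanup; the file and input names are placeholders:

using ONNXRunTime

model = load_inference("model.onnx")
y = model(Dict("input" => randn(Float32, 2, 3)))

# Free the native (and possibly GPU) memory now instead of waiting for the GC.
ONNXRunTime.release(model)
# Calling model(...) after this point is an error.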
Low level API
ONNXRunTime.CAPI
— Module
module CAPI
This module closely follows the official onnxruntime C API. See the onnxruntime C API documentation for a C code example.
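As an illustration, a minimal end-to-end sketch composed from the functions documented below. The model path, the input/output names, and the shape are placeholders, and keyword defaults are assumed where no value is passed:

using ONNXRunTime.CAPI

api     = GetApi(OrtGetApiBase(execution_provider = :cpu))
env     = CreateEnv(api; name = "demo")
so      = CreateSessionOptions(api)
session = CreateSession(api, env, "model.onnx", so)

# Wrap a Julia array as an OrtValue. ONNX Runtime reads the buffer in row-major
# order, so the column-major Julia data is permuted before flattening here
# (the high level API takes care of this automatically).
meminfo = CreateCpuMemoryInfo(api)
x       = randn(Float32, 2, 3)
xtensor = CreateTensorWithDataAsOrtValue(api, meminfo, vec(permutedims(x)), size(x))

outputs = Run(api, session, nothing, ["input"], [xtensor], ["output"])
y       = GetTensorMutableData(api, only(outputs))   # output as a Julia array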
ONNXRunTime.CAPI.ONNXTensorElementDataType
— Type
ONNXRunTime.CAPI.ONNXTensorElementDataType
CEnum with possible values:
- ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT
- ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8
- ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8
- ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16
- ONNX_TENSOR_ELEMENT_DATA_TYPE_INT16
- ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32
- ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64
- ONNX_TENSOR_ELEMENT_DATA_TYPE_STRING
- ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16
- ONNX_TENSOR_ELEMENT_DATA_TYPE_DOUBLE
- ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT32
- ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT64
- ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX64
- ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX128
- ONNX_TENSOR_ELEMENT_DATA_TYPE_BFLOAT16
ONNXRunTime.CAPI.OrtAllocator
— Type
ONNXRunTime.CAPI.OrtAllocator
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtAllocator.
ONNXRunTime.CAPI.OrtAllocatorType
— Type
ONNXRunTime.CAPI.OrtAllocatorType
CEnum with possible values:
- Invalid
- OrtDeviceAllocator
- OrtArenaAllocator
ONNXRunTime.CAPI.OrtApi
— Type
struct OrtApi
ONNXRunTime.CAPI.OrtApiBase
— Type
struct OrtApiBase
ONNXRunTime.CAPI.OrtArenaCfg
— Type
ONNXRunTime.CAPI.OrtArenaCfg
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtArenaCfg.
ONNXRunTime.CAPI.OrtCUDAProviderOptions
— Type
struct OrtCUDAProviderOptions
ONNXRunTime.CAPI.OrtCUDAProviderOptions
— Method
OrtCUDAProviderOptions(
;
device_id,
cudnn_conv_algo_search,
gpu_mem_limit,
arena_extend_strategy,
do_copy_in_default_stream,
has_user_compute_stream,
user_compute_stream,
default_memory_arena_cfg
) -> ONNXRunTime.CAPI.OrtCUDAProviderOptions
ONNXRunTime.CAPI.OrtCudnnConvAlgoSearch
— Type
ONNXRunTime.CAPI.OrtCudnnConvAlgoSearch
CEnum with possible values:
- EXHAUSTIVE
- HEURISTIC
- DEFAULT
ONNXRunTime.CAPI.OrtCustomOpDomain
— Type
ONNXRunTime.CAPI.OrtCustomOpDomain
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtCustomOpDomain.
ONNXRunTime.CAPI.OrtEnv
— Type
ONNXRunTime.CAPI.OrtEnv
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtEnv.
ONNXRunTime.CAPI.OrtIoBinding
— Type
ONNXRunTime.CAPI.OrtIoBinding
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtIoBinding.
ONNXRunTime.CAPI.OrtLoggingLevel
— Type
ONNXRunTime.CAPI.OrtLoggingLevel
CEnum with possible values:
- ORT_LOGGING_LEVEL_VERBOSE
- ORT_LOGGING_LEVEL_INFO
- ORT_LOGGING_LEVEL_WARNING
- ORT_LOGGING_LEVEL_ERROR
- ORT_LOGGING_LEVEL_FATAL
ONNXRunTime.CAPI.OrtMapTypeInfo
— Type
ONNXRunTime.CAPI.OrtMapTypeInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtMapTypeInfo.
ONNXRunTime.CAPI.OrtMemType
— Type
ONNXRunTime.CAPI.OrtMemType
CEnum with possible values:
- OrtMemTypeCPUInput
- OrtMemTypeCPUOutput
- OrtMemTypeCPU
- OrtMemTypeDefault
ONNXRunTime.CAPI.OrtMemoryInfo
— Type
ONNXRunTime.CAPI.OrtMemoryInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtMemoryInfo.
ONNXRunTime.CAPI.OrtModelMetadata
— Type
ONNXRunTime.CAPI.OrtModelMetadata
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtModelMetadata.
ONNXRunTime.CAPI.OrtPrepackedWeightsContainer
— Type
ONNXRunTime.CAPI.OrtPrepackedWeightsContainer
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtPrepackedWeightsContainer.
ONNXRunTime.CAPI.OrtRunOptions
— Type
ONNXRunTime.CAPI.OrtRunOptions
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtRunOptions.
ONNXRunTime.CAPI.OrtSequenceTypeInfo
— Type
ONNXRunTime.CAPI.OrtSequenceTypeInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtSequenceTypeInfo.
ONNXRunTime.CAPI.OrtSession
— Type
ONNXRunTime.CAPI.OrtSession
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtSession.
ONNXRunTime.CAPI.OrtSessionOptions
— Type
ONNXRunTime.CAPI.OrtSessionOptions
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtSessionOptions.
ONNXRunTime.CAPI.OrtStatus
— Type
ONNXRunTime.CAPI.OrtStatus
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtStatus.
ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
— Type
ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo.
ONNXRunTime.CAPI.OrtThreadingOptions
— Type
ONNXRunTime.CAPI.OrtThreadingOptions
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtThreadingOptions.
ONNXRunTime.CAPI.OrtTypeInfo
— Type
ONNXRunTime.CAPI.OrtTypeInfo
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtTypeInfo.
ONNXRunTime.CAPI.OrtValue
— Type
ONNXRunTime.CAPI.OrtValue
Wraps a pointer to the C object of type ONNXRunTime.CAPI.OrtValue.
ONNXRunTime.CAPI.AllocatorFree
— Method
AllocatorFree(
api::ONNXRunTime.CAPI.OrtApi,
allocator::ONNXRunTime.CAPI.OrtAllocator,
ptr::Ptr
)
ONNXRunTime.CAPI.CreateAllocator
— Method
CreateAllocator(
api::ONNXRunTime.CAPI.OrtApi,
session::ONNXRunTime.CAPI.OrtSession,
meminfo::ONNXRunTime.CAPI.OrtMemoryInfo
) -> ONNXRunTime.CAPI.OrtAllocator
ONNXRunTime.CAPI.CreateArenaCfgV2
— Method
CreateArenaCfgV2(
api::ONNXRunTime.CAPI.OrtApi,
keys,
vals
) -> ONNXRunTime.CAPI.OrtArenaCfg
ONNXRunTime.CAPI.CreateCpuMemoryInfo
— Method
CreateCpuMemoryInfo(
api::ONNXRunTime.CAPI.OrtApi;
allocator_type,
mem_type
) -> ONNXRunTime.CAPI.OrtMemoryInfo
ONNXRunTime.CAPI.CreateEnv
— Method
CreateEnv(api::ONNXRunTime.CAPI.OrtApi; logging_level, name)
ONNXRunTime.CAPI.CreateRunOptions
— Method
CreateRunOptions(
api::ONNXRunTime.CAPI.OrtApi
) -> ONNXRunTime.CAPI.OrtRunOptions
ONNXRunTime.CAPI.CreateSession
— Method
CreateSession(
api::ONNXRunTime.CAPI.OrtApi,
env::ONNXRunTime.CAPI.OrtEnv,
model_path::AbstractString,
options::ONNXRunTime.CAPI.OrtSessionOptions
) -> ONNXRunTime.CAPI.OrtSession
ONNXRunTime.CAPI.CreateSessionOptions
— Method
CreateSessionOptions(
api::ONNXRunTime.CAPI.OrtApi
) -> ONNXRunTime.CAPI.OrtSessionOptions
ONNXRunTime.CAPI.CreateTensorWithDataAsOrtValue
— Method
CreateTensorWithDataAsOrtValue(
api::ONNXRunTime.CAPI.OrtApi,
memory_info::ONNXRunTime.CAPI.OrtMemoryInfo,
data::Vector,
shape
) -> ONNXRunTime.CAPI.OrtValue
Return a tensor with shape shape that is backed by the memory of data.
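A small sketch of the lifetime implication: because the OrtValue only wraps the memory of data, the Julia array should be kept alive (and not resized) for as long as the tensor is in use. Names and values are placeholders:

using ONNXRunTime.CAPI

api     = GetApi(OrtGetApiBase(execution_provider = :cpu))
meminfo = CreateCpuMemoryInfo(api)

data   = Float32[1, 2, 3, 4, 5, 6]   # backing memory; keep it alive while the tensor is used
tensor = CreateTensorWithDataAsOrtValue(api, meminfo, data, (2, 3))

IsTensor(api, tensor)                                     # true
GetDimensions(api, GetTensorTypeAndShape(api, tensor))    # [2, 3]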
ONNXRunTime.CAPI.Free
— Method
Free(
alloc::ONNXRunTime.CAPI.OrtAllocator,
ptr::Union{Cstring, Ptr}
)
ONNXRunTime.CAPI.GetApi
— Function
GetApi(
api_base::ONNXRunTime.CAPI.OrtApiBase
) -> ONNXRunTime.CAPI.OrtApi
GetApi(
api_base::ONNXRunTime.CAPI.OrtApiBase,
ort_api_version::Integer
) -> ONNXRunTime.CAPI.OrtApi
ONNXRunTime.CAPI.GetDimensions
— Function
GetDimensions(
api::ONNXRunTime.CAPI.OrtApi,
o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
) -> Vector{Int64}
GetDimensions(
api::ONNXRunTime.CAPI.OrtApi,
o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo,
ndims
) -> Vector{Int64}
ONNXRunTime.CAPI.GetDimensionsCount
— Method
GetDimensionsCount(
api::ONNXRunTime.CAPI.OrtApi,
o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
) -> UInt64
ONNXRunTime.CAPI.GetErrorMessage
— Method
GetErrorMessage(
api::ONNXRunTime.CAPI.OrtApi,
status::Ptr{Nothing}
) -> String
ONNXRunTime.CAPI.GetTensorElementType
— Method
GetTensorElementType(
api::ONNXRunTime.CAPI.OrtApi,
o::ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
) -> ONNXRunTime.CAPI.ONNXTensorElementDataType
ONNXRunTime.CAPI.GetTensorMutableData
— Method
GetTensorMutableData(
api::ONNXRunTime.CAPI.OrtApi,
tensor::ONNXRunTime.CAPI.OrtValue
) -> PermutedDimsArray{T, N, perm, iperm, Array{T1, N1}} where {T, N, perm, iperm, N1, T1}
ONNXRunTime.CAPI.GetTensorTypeAndShape
— Method
GetTensorTypeAndShape(
api::ONNXRunTime.CAPI.OrtApi,
o::ONNXRunTime.CAPI.OrtValue
) -> ONNXRunTime.CAPI.OrtTensorTypeAndShapeInfo
ONNXRunTime.CAPI.GetVersionString
— Method
GetVersionString(
api_base::ONNXRunTime.CAPI.OrtApiBase
) -> String
ONNXRunTime.CAPI.IsTensor
— Method
IsTensor(
api::ONNXRunTime.CAPI.OrtApi,
val::ONNXRunTime.CAPI.OrtValue
) -> Bool
ONNXRunTime.CAPI.OrtGetApiBase
— Method
OrtGetApiBase(
;
execution_provider
) -> ONNXRunTime.CAPI.OrtApiBase
ONNXRunTime.CAPI.Run
— Method
Run(
api::ONNXRunTime.CAPI.OrtApi,
session::ONNXRunTime.CAPI.OrtSession,
run_options::Union{Nothing, ONNXRunTime.CAPI.OrtRunOptions},
input_names::Vector{String},
inputs::Vector{ONNXRunTime.CAPI.OrtValue},
output_names::Vector{String}
) -> Vector{ONNXRunTime.CAPI.OrtValue}
ONNXRunTime.CAPI.SessionGetInputCount
— Method
SessionGetInputCount(
api::ONNXRunTime.CAPI.OrtApi,
session::ONNXRunTime.CAPI.OrtSession
) -> UInt64
ONNXRunTime.CAPI.SessionGetInputName
— Method
SessionGetInputName(
api::ONNXRunTime.CAPI.OrtApi,
session::ONNXRunTime.CAPI.OrtSession,
index::Integer,
allocator::ONNXRunTime.CAPI.OrtAllocator
) -> String
ONNXRunTime.CAPI.SessionGetModelMetadata
— Method
SessionGetModelMetadata(
api::ONNXRunTime.CAPI.OrtApi,
session::ONNXRunTime.CAPI.OrtSession
) -> ONNXRunTime.CAPI.OrtModelMetadata
ONNXRunTime.CAPI.SessionGetOutputCount
— Method
SessionGetOutputCount(
api::ONNXRunTime.CAPI.OrtApi,
sess::ONNXRunTime.CAPI.OrtSession
) -> UInt64
ONNXRunTime.CAPI.SessionGetOutputName
— Method
SessionGetOutputName(
api::ONNXRunTime.CAPI.OrtApi,
session::ONNXRunTime.CAPI.OrtSession,
index::Integer,
allocator::ONNXRunTime.CAPI.OrtAllocator
) -> String
ONNXRunTime.CAPI.SessionOptionsAppendExecutionProvider_CUDA
— Method
SessionOptionsAppendExecutionProvider_CUDA(
api::ONNXRunTime.CAPI.OrtApi,
session_options::ONNXRunTime.CAPI.OrtSessionOptions,
cuda_options::ONNXRunTime.CAPI.OrtCUDAProviderOptions
)
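A hedged sketch of enabling the CUDA provider at this level. A CUDA-capable setup is assumed, device_id = 0 is a placeholder, and the remaining OrtCUDAProviderOptions fields are left at their constructor defaults:

using CUDA, cuDNN          # required before the CUDA execution provider can be used
using ONNXRunTime.CAPI

api          = GetApi(OrtGetApiBase(execution_provider = :cuda))
so           = CreateSessionOptions(api)
cuda_options = OrtCUDAProviderOptions(; device_id = 0)
SessionOptionsAppendExecutionProvider_CUDA(api, so, cuda_options)
# so can now be passed to CreateSession to run the model on the GPU.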
ONNXRunTime.CAPI.into_julia
— Method
into_julia(
_::Type{T},
api::ONNXRunTime.CAPI.OrtApi,
objptr::Ref{Ptr{Nothing}},
status_ptr::Ptr{Nothing},
gchandles
) -> Any
Create a Julia object from the output of an API call. Check and release status_ptr.
ONNXRunTime.CAPI.juliatype
— Method
juliatype(
onnx::ONNXRunTime.CAPI.ONNXTensorElementDataType
) -> Type
ONNXRunTime.CAPI.release
— Function
release(api::OrtApi, obj)::Nothing
Release memory owned by obj. The garbage collector should call this function automatically; if it does not, that is a bug that should be reported.
There might, however, be situations with high memory pressure, in which it can help to call this function manually to release memory earlier. Using an object after releasing it is undefined behaviour.
ONNXRunTime.CAPI.unsafe_GetTensorMutableData
— Method
unsafe_GetTensorMutableData(
api::ONNXRunTime.CAPI.OrtApi,
tensor::ONNXRunTime.CAPI.OrtValue
) -> PermutedDimsArray{T, N, perm, iperm, Array{T1, N1}} where {T, N, perm, iperm, N1, T1}
This function is unsafe, because its output points to memory owned by tensor. After tensor is released, accessing the output becomes undefined behaviour.
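A sketch of the safe pattern, copy first and release afterwards. The tensor below is just a stand-in for any OrtValue, for example one returned by Run:

using ONNXRunTime            # for the fully qualified ONNXRunTime.CAPI.release below
using ONNXRunTime.CAPI

api     = GetApi(OrtGetApiBase(execution_provider = :cpu))
meminfo = CreateCpuMemoryInfo(api)
tensor  = CreateTensorWithDataAsOrtValue(api, meminfo, Float32[1, 2, 3, 4], (2, 2))

unsafe_view = unsafe_GetTensorMutableData(api, tensor)   # aliases memory owned by tensor
result = copy(unsafe_view)                                # detach before releasing

ONNXRunTime.CAPI.release(api, tensor)
# result stays valid; accessing unsafe_view now would be undefined behaviour.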