TensorRT stuff
TensorRT support matrix: https://docs.nvidia.com/deeplearning/dgx/integrate-tf-trt/index.html#matrix
To apply the TensorRT optimizations, call the create_inference_graph
function (in TF 1.x it lives in tensorflow.contrib.tensorrt). Check here for more details on this function.
The graph that is fed to create_inference_graph
should be frozen, i.e. its variables converted to constants so the GraphDef is self-contained. To learn more about what "freezing" means exactly, check here.
For using the bare TensorRT Python module (without going through TensorFlow), check out here.