Flash provides several resource configuration classes for different use cases. This reference covers all available parameters and options.

LiveServerless

LiveServerless is the primary configuration class for Flash. It supports full remote code execution, allowing you to run arbitrary Python functions on Runpod’s infrastructure.

```python
from tetra_rp import LiveServerless, GpuGroup, PodTemplate

gpu_config = LiveServerless(
    name="ml-inference",
    gpus=[GpuGroup.AMPERE_80],
    workersMax=5,
    idleTimeout=10,
    template=PodTemplate(containerDiskInGb=100)
)
```
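
For context, here is a minimal sketch of how a configuration like this feeds the @remote decorator mentioned in the comparison table below. The decorator signature and the dependencies parameter are assumptions based on typical Flash usage; check the Flash getting-started guide for the authoritative API:

```python
import asyncio

from tetra_rp import remote, LiveServerless, GpuGroup

gpu_config = LiveServerless(
    name="ml-inference",
    gpus=[GpuGroup.AMPERE_80],
)

# `dependencies` is assumed to be a list of pip packages installed on the
# worker before the function runs; numpy is imported inside the function so
# the name resolves remotely, not locally.
@remote(resource_config=gpu_config, dependencies=["numpy"])
def matrix_sum(size: int) -> float:
    import numpy as np
    return float(np.random.rand(size, size).sum())

async def main():
    print(await matrix_sum(1024))  # executes on a Runpod worker

asyncio.run(main())
```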

Parameters

| Parameter | Type | Description | Default |
|---|---|---|---|
| name | string | Name for your endpoint (required) | - |
| gpus | list[GpuGroup] | GPU pool IDs that can be used by workers | [GpuGroup.ANY] |
| gpuCount | int | Number of GPUs per worker | 1 |
| instanceIds | list[CpuInstanceType] | CPU instance types (forces CPU endpoint) | None |
| workersMin | int | Minimum number of workers | 0 |
| workersMax | int | Maximum number of workers | 3 |
| idleTimeout | int | Minutes before scaling down | 5 |
| env | dict | Environment variables | None |
| networkVolumeId | string | Persistent storage volume ID | None |
| executionTimeoutMs | int | Max execution time in milliseconds | 0 (no limit) |
| scalerType | string | Scaling strategy | QUEUE_DELAY |
| scalerValue | int | Scaling parameter value | 4 |
| locations | string | Preferred datacenter locations | None |
| template | PodTemplate | Pod template overrides | None |
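
For illustration, a hedged sketch that tunes the scaling parameters. QUEUE_DELAY and the value 4 are the documented defaults; the comments interpret scalerValue as seconds of queue delay, which is an assumption to verify against Runpod's serverless scaling docs:

```python
from tetra_rp import LiveServerless, GpuGroup

config = LiveServerless(
    name="bursty-inference",
    gpus=[GpuGroup.AMPERE_24],
    workersMin=1,               # keep one warm worker to reduce cold starts
    workersMax=10,
    scalerType="QUEUE_DELAY",   # default strategy: scale on request queue delay
    scalerValue=4,              # assumed: seconds a request may wait before scaling up
    executionTimeoutMs=600_000, # fail any request running longer than 10 minutes
)
```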

GPU configuration example

```python
from tetra_rp import LiveServerless, GpuGroup, PodTemplate

config = LiveServerless(
    name="gpu-inference",
    gpus=[GpuGroup.AMPERE_80],  # A100 80GB
    gpuCount=1,
    workersMin=0,
    workersMax=5,
    idleTimeout=10,
    template=PodTemplate(containerDiskInGb=100),
    env={"MODEL_ID": "llama-7b"}
)
```

CPU configuration example

```python
from tetra_rp import LiveServerless, CpuInstanceType

config = LiveServerless(
    name="cpu-processor",
    instanceIds=[CpuInstanceType.CPU5C_4_8],  # 4 vCPU, 8GB RAM
    workersMax=3,
    idleTimeout=5
)
```

ServerlessEndpoint

ServerlessEndpoint is for GPU workloads that require custom Docker images. Unlike LiveServerless, it only supports dictionary payloads and cannot execute arbitrary Python functions.

```python
from tetra_rp import ServerlessEndpoint, GpuGroup

config = ServerlessEndpoint(
    name="custom-ml-env",
    imageName="pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime",
    gpus=[GpuGroup.AMPERE_80]
)
```

Parameters

All parameters from LiveServerless are available, plus:
| Parameter | Type | Description | Default |
|---|---|---|---|
| imageName | string | Custom Docker image | - |

Limitations

- Only supports dictionary payloads in the form of `{"input": {...}}`.
- Cannot execute arbitrary Python functions remotely.
- Requires a custom Docker image with a handler that processes the input dictionary.

Example

```python
from tetra_rp import ServerlessEndpoint, GpuGroup

# Custom image with pre-installed models
config = ServerlessEndpoint(
    name="stable-diffusion",
    imageName="my-registry/stable-diffusion:v1.0",
    gpus=[GpuGroup.AMPERE_24],
    workersMax=3
)

# Send requests as dictionaries
result = await config.run({
    "input": {
        "prompt": "A beautiful sunset over mountains",
        "width": 512,
        "height": 512
    }
})
```

CpuServerlessEndpoint

CpuServerlessEndpoint is for CPU workloads that require custom Docker images. Like ServerlessEndpoint, it only supports dictionary payloads.

```python
from tetra_rp import CpuServerlessEndpoint, CpuInstanceType

config = CpuServerlessEndpoint(
    name="data-processor",
    imageName="python:3.11-slim",
    instanceIds=[CpuInstanceType.CPU5C_4_8]
)
```
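
Because CpuServerlessEndpoint shares ServerlessEndpoint's payload model, requests follow the same dictionary shape. A sketch, assuming the handler baked into the image accepts these (hypothetical) fields:

```python
# The custom image's handler defines what goes inside "input".
result = await config.run({
    "input": {
        "operation": "normalize",   # hypothetical field
        "values": [3, 1, 4, 1, 5],  # hypothetical field
    }
})
```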

Parameters

| Parameter | Type | Description | Default |
|---|---|---|---|
| name | string | Name for your endpoint (required) | - |
| imageName | string | Custom Docker image | - |
| instanceIds | list[CpuInstanceType] | CPU instance types | - |
| workersMin | int | Minimum number of workers | 0 |
| workersMax | int | Maximum number of workers | 3 |
| idleTimeout | int | Minutes before scaling down | 5 |
| env | dict | Environment variables | None |
| networkVolumeId | string | Persistent storage volume ID | None |
| executionTimeoutMs | int | Max execution time in milliseconds | 0 (no limit) |

Resource class comparison

| Feature | LiveServerless | ServerlessEndpoint | CpuServerlessEndpoint |
|---|---|---|---|
| Remote code execution | ✅ Full Python function execution | ❌ Dictionary payload only | ❌ Dictionary payload only |
| Custom Docker images | ❌ Fixed optimized images | ✅ Any Docker image | ✅ Any Docker image |
| Use case | Dynamic remote functions | Traditional API endpoints | Traditional CPU endpoints |
| Function returns | Any Python object | Dictionary only | Dictionary only |
| @remote decorator | Full functionality | Limited to payload passing | Limited to payload passing |
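
To make the first two rows concrete, here is the same greeting task expressed both ways (a sketch; greet, the image name, and the payload fields are placeholders):

```python
from tetra_rp import LiveServerless, ServerlessEndpoint, remote

# LiveServerless: ship an arbitrary Python function to the endpoint
# and get back any Python object.
live = LiveServerless(name="greeter-live")

@remote(resource_config=live)
def greet(name: str) -> str:
    return f"Hello, {name}!"

live_result = await greet("Flash")

# ServerlessEndpoint: the handler is baked into the Docker image, so you
# can only pass a dictionary payload and receive a dictionary back.
endpoint = ServerlessEndpoint(
    name="greeter-endpoint",
    imageName="my-registry/greeter:v1.0",  # hypothetical image with a handler
)
endpoint_result = await endpoint.run({"input": {"name": "Flash"}})
```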

Available GPU types

The GpuGroup enum provides access to GPU pools. Some common options:
| GpuGroup | Description | VRAM |
|---|---|---|
| GpuGroup.ANY | Any available GPU (default) | Varies |
| GpuGroup.ADA_24 | RTX 4090 | 24GB |
| GpuGroup.AMPERE_24 | RTX A5000, L4, RTX 3090 | 24GB |
| GpuGroup.AMPERE_48 | A40, RTX A6000 | 48GB |
| GpuGroup.AMPERE_80 | A100 80GB | 80GB |
See GPU types for the complete list of available GPU pools.
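
Since gpus accepts a list, you can widen the pool so workers can come from whichever group has capacity. A sketch; whether the list is treated as a preference order or an unordered pool is not specified here:

```python
from tetra_rp import LiveServerless, GpuGroup

# Any 24GB-class GPU is acceptable for this workload.
config = LiveServerless(
    name="flexible-inference",
    gpus=[GpuGroup.AMPERE_24, GpuGroup.ADA_24],
)
```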

Available CPU instance types

The CpuInstanceType enum provides access to CPU configurations:

3rd generation general purpose

| CpuInstanceType | ID | vCPU | RAM |
|---|---|---|---|
| CPU3G_1_4 | cpu3g-1-4 | 1 | 4GB |
| CPU3G_2_8 | cpu3g-2-8 | 2 | 8GB |
| CPU3G_4_16 | cpu3g-4-16 | 4 | 16GB |
| CPU3G_8_32 | cpu3g-8-32 | 8 | 32GB |

3rd generation compute-optimized

| CpuInstanceType | ID | vCPU | RAM |
|---|---|---|---|
| CPU3C_1_2 | cpu3c-1-2 | 1 | 2GB |
| CPU3C_2_4 | cpu3c-2-4 | 2 | 4GB |
| CPU3C_4_8 | cpu3c-4-8 | 4 | 8GB |
| CPU3C_8_16 | cpu3c-8-16 | 8 | 16GB |

5th generation compute-optimized

| CpuInstanceType | ID | vCPU | RAM |
|---|---|---|---|
| CPU5C_1_2 | cpu5c-1-2 | 1 | 2GB |
| CPU5C_2_4 | cpu5c-2-4 | 2 | 4GB |
| CPU5C_4_8 | cpu5c-4-8 | 4 | 8GB |
| CPU5C_8_16 | cpu5c-8-16 | 8 | 16GB |
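
instanceIds likewise accepts a list, so an endpoint can draw from several instance types. A sketch mixing generations at the same 4 vCPU / 8GB shape; as with gpus, the ordering semantics are an assumption:

```python
from tetra_rp import LiveServerless, CpuInstanceType

# 5th-gen compute-optimized first, with 3rd-gen as an additional pool.
config = LiveServerless(
    name="cpu-batch",
    instanceIds=[CpuInstanceType.CPU5C_4_8, CpuInstanceType.CPU3C_4_8],
)
```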

PodTemplate

Use PodTemplate to configure additional pod settings:

```python
from tetra_rp import LiveServerless, PodTemplate

config = LiveServerless(
    name="custom-template",
    template=PodTemplate(
        containerDiskInGb=100,
        env=[{"key": "PYTHONPATH", "value": "/workspace"}]
    )
)
```

Parameters

| Parameter | Type | Description | Default |
|---|---|---|---|
| containerDiskInGb | int | Container disk size in GB | 20 |
| env | list[dict] | Environment variables as key-value pairs | None |

Environment variables

Environment variables can be set in two ways:

Using the env parameter

```python
from tetra_rp import LiveServerless

config = LiveServerless(
    name="api-worker",
    env={"HF_TOKEN": "your_token", "MODEL_ID": "gpt2"}
)
```

Using PodTemplate

```python
from tetra_rp import LiveServerless, PodTemplate

config = LiveServerless(
    name="api-worker",
    template=PodTemplate(
        env=[
            {"key": "HF_TOKEN", "value": "your_token"},
            {"key": "MODEL_ID", "value": "gpt2"}
        ]
    )
)
```

Environment variables are excluded from configuration hashing. Changing environment values won’t trigger endpoint recreation, which allows different processes to load environment variables from .env files without causing false drift detection. Only structural changes (like GPU type, image, or template modifications) trigger endpoint updates.
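
For example, each process can hydrate env from a local .env file without risking endpoint recreation. This sketch uses the third-party python-dotenv package, which is an assumption and not part of Flash:

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv
from tetra_rp import LiveServerless

load_dotenv()  # reads HF_TOKEN and MODEL_ID from a local .env file

config = LiveServerless(
    name="api-worker",
    env={
        "HF_TOKEN": os.environ["HF_TOKEN"],
        "MODEL_ID": os.getenv("MODEL_ID", "gpt2"),
    },
)
# Different env values across processes hash identically, so the
# endpoint is not recreated.
```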
