airt db

A set of commands for importing and processing data from sources such as CSV/parquet files, databases, AWS S3 buckets, and Azure Blob Storage.

Usage:

$ airt db [OPTIONS] COMMAND [ARGS]...

Options:

  • --install-completion [bash|zsh|fish|powershell|pwsh]: Install completion for the specified shell.
  • --show-completion [bash|zsh|fish|powershell|pwsh]: Show completion for the specified shell, to copy it or customize the installation.
  • --help: Show this message and exit.

Commands:

  • details: Return details of a datablob.
  • from-azure-blob-storage: Create and return a datablob that...
  • from-clickhouse: Create and return a datablob that...
  • from-local: Create and return a datablob from local...
  • from-mysql: Create and return a datablob that...
  • from-s3: Create and return a datablob that...
  • ls: Return the list of datablobs.
  • rm: Delete a datablob from the server.
  • tag: Tag an existing datablob in the server.
  • to-datasource: Process the datablob and return a...

airt db details

Return details of a datablob.

Usage:

$ airt db details [OPTIONS] UUID

Arguments:

  • UUID: Datablob uuid. [required]

Options:

  • -f, --format TEXT: Format output and show only the given column(s) values.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.

airt db from-azure-blob-storage

Create and return a datablob that encapsulates the data from Azure Blob Storage.

Usage:

$ airt db from-azure-blob-storage [OPTIONS] URI

Arguments:

  • URI: Azure Blob Storage URI of the source file. [required]

Options:

  • -c, --credential TEXT: Credential to access the Azure Blob Storage. [required]
  • -cp, --cloud-provider TEXT: The destination cloud storage provider's name to store the datablob. Currently, the API only supports aws and azure as cloud storage providers. If None (default value), then azure will be used as the cloud storage provider.
  • -r, --region TEXT: The destination cloud provider's region to save your datablob. If None (default value) then the default region will be assigned based on the cloud provider. In the case of aws, eu-west-1 will be used and in the case of azure, westeurope will be used. The supported AWS regions are: ap-northeast-1, ap-northeast-2, ap-south-1, ap-southeast-1, ap-southeast-2, ca-central-1, eu-central-1, eu-north-1, eu-west-1, eu-west-2, eu-west-3, sa-east-1, us-east-1, us-east-2, us-west-1, us-west-2. The supported Azure Blob Storage regions are: australiacentral, australiacentral2, australiaeast, australiasoutheast, brazilsouth, canadacentral, canadaeast, centralindia, centralus, eastasia, eastus, eastus2, francecentral, francesouth, germanynorth, germanywestcentral, japaneast, japanwest, koreacentral, koreasouth, northcentralus, northeurope, norwayeast, norwaywest, southafricanorth, southafricawest, southcentralus, southeastasia, southindia, switzerlandnorth, switzerlandwest, uaecentral, uaenorth, uksouth, ukwest, westcentralus, westeurope, westindia, westus, westus2.
  • -t, --tag TEXT: A string to tag the datablob. If not passed, then the tag latest will be assigned to the datablob.
  • -q, --quiet: Output datablob uuid only.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.
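
The default handling described for --cloud-provider and --region can be sketched as a small lookup. This is only an illustration of the documented behaviour, not the tool's actual implementation; the function name resolve_destination and the default_provider parameter are hypothetical.

```python
from typing import Optional, Tuple

# Default regions as stated in the --region description above.
DEFAULT_REGIONS = {"aws": "eu-west-1", "azure": "westeurope"}


def resolve_destination(
    cloud_provider: Optional[str] = None,
    region: Optional[str] = None,
    default_provider: str = "azure",  # from-azure-blob-storage defaults to azure
) -> Tuple[str, str]:
    """Return the (provider, region) pair implied by the documented defaults."""
    provider = cloud_provider or default_provider
    if provider not in DEFAULT_REGIONS:
        raise ValueError(f"unsupported cloud provider: {provider!r}")
    return provider, region or DEFAULT_REGIONS[provider]


print(resolve_destination())                      # ('azure', 'westeurope')
print(resolve_destination(cloud_provider="aws"))  # ('aws', 'eu-west-1')
print(resolve_destination("aws", "us-east-1"))    # ('aws', 'us-east-1')
```

For commands whose default provider is aws, the same sketch applies with default_provider="aws". Note that from-s3 documents a different aws default (the source bucket's region), which this sketch does not model.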

airt db from-clickhouse

Create and return a datablob that encapsulates the data from a ClickHouse database.

If the database requires authentication, pass the username and password as command-line arguments or store them in the CLICKHOUSE_USERNAME and CLICKHOUSE_PASSWORD environment variables.

Usage:

$ airt db from-clickhouse [OPTIONS]

Options:

  • --host TEXT: Remote database host name. [required]
  • --database TEXT: Database name. [required]
  • --table TEXT: Table name. [required]
  • --protocol TEXT: Protocol to use. The valid values are "native" and "http". [required]
  • --index-column TEXT: The column to use as index (row labels). [required]
  • --timestamp-column TEXT: Timestamp column name in the table. [required]
  • --port INTEGER: Host port number. If not passed, then the default value 0 will be used. [default: 0]
  • -cp, --cloud-provider TEXT: The destination cloud storage provider's name to store the datablob. Currently, the API only supports aws and azure as cloud storage providers. If None (default value), then aws will be used as the cloud storage provider.
  • -r, --region TEXT: The destination cloud provider's region to save your datablob. If None (default value) then the default region will be assigned based on the cloud provider. In the case of aws, eu-west-1 will be used and in the case of azure, westeurope will be used. The supported AWS regions are: ap-northeast-1, ap-northeast-2, ap-south-1, ap-southeast-1, ap-southeast-2, ca-central-1, eu-central-1, eu-north-1, eu-west-1, eu-west-2, eu-west-3, sa-east-1, us-east-1, us-east-2, us-west-1, us-west-2. The supported Azure Blob Storage regions are: australiacentral, australiacentral2, australiaeast, australiasoutheast, brazilsouth, canadacentral, canadaeast, centralindia, centralus, eastasia, eastus, eastus2, francecentral, francesouth, germanynorth, germanywestcentral, japaneast, japanwest, koreacentral, koreasouth, northcentralus, northeurope, norwayeast, norwaywest, southafricanorth, southafricawest, southcentralus, southeastasia, southindia, switzerlandnorth, switzerlandwest, uaecentral, uaenorth, uksouth, ukwest, westcentralus, westeurope, westindia, westus, westus2.
  • -u, --username TEXT: Database username. If not passed, the default value 'root' will be used unless the value is explicitly set in the environment variable CLICKHOUSE_USERNAME.
  • -p, --password TEXT: Database password. If not passed, the default value '' will be used unless the value is explicitly set in the environment variable CLICKHOUSE_PASSWORD.
  • -f, --filters-json TEXT: Additional parameters to be used when importing data. For example, if you want to filter and extract data only for a specific user_id, pass '{"user_id": 1}'.
  • -t, --tag TEXT: A string to tag the datablob. If not passed, then the tag latest will be assigned to the datablob.
  • -q, --quiet: Output datablob uuid only.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.
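
Because --filters-json expects a JSON string, quoting it by hand in a shell is easy to get wrong. The sketch below builds the argument list in Python and lets json.dumps produce the filter value. The helper name clickhouse_command and the host, database, table, and column values are made up for illustration; only the flags come from the option list above.

```python
import json
import shlex
from typing import Any, Dict, List, Optional


def clickhouse_command(
    host: str,
    database: str,
    table: str,
    protocol: str,
    index_column: str,
    timestamp_column: str,
    filters: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """Build an argument list for `airt db from-clickhouse` (hypothetical helper)."""
    if protocol not in ("native", "http"):
        raise ValueError('protocol must be "native" or "http"')
    argv = [
        "airt", "db", "from-clickhouse",
        "--host", host,
        "--database", database,
        "--table", table,
        "--protocol", protocol,
        "--index-column", index_column,
        "--timestamp-column", timestamp_column,
    ]
    if filters is not None:
        # json.dumps produces the JSON string that --filters-json expects.
        argv += ["--filters-json", json.dumps(filters)]
    return argv


argv = clickhouse_command(
    host="db.example.com",          # placeholder host
    database="events_db",           # placeholder database
    table="events",                 # placeholder table
    protocol="native",
    index_column="user_id",
    timestamp_column="event_time",
    filters={"user_id": 1},
)
# shlex.join shows the command as it could be typed in a POSIX shell.
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run directly, which avoids shell quoting altogether.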

airt db from-local

Create and return a datablob from a local CSV or Parquet file.

The API currently allows users to create datablobs from CSV or Parquet files. We intend to support additional file formats in future releases.

Usage:

$ airt db from-local [OPTIONS]

Options:

  • -p, --path TEXT: The relative or absolute path to a local CSV/parquet file or to a directory containing the CSV/parquet files. [required]
  • -cp, --cloud-provider TEXT: The destination cloud storage provider's name to store the datablob. Currently, the API only supports aws and azure as cloud storage providers. If None (default value), then aws will be used as the cloud storage provider.
  • -r, --region TEXT: The destination cloud provider's region to save your datablob. If None (default value) then the default region will be assigned based on the cloud provider. In the case of aws, eu-west-1 will be used and in the case of azure, westeurope will be used. The supported AWS regions are: ap-northeast-1, ap-northeast-2, ap-south-1, ap-southeast-1, ap-southeast-2, ca-central-1, eu-central-1, eu-north-1, eu-west-1, eu-west-2, eu-west-3, sa-east-1, us-east-1, us-east-2, us-west-1, us-west-2. The supported Azure Blob Storage regions are: australiacentral, australiacentral2, australiaeast, australiasoutheast, brazilsouth, canadacentral, canadaeast, centralindia, centralus, eastasia, eastus, eastus2, francecentral, francesouth, germanynorth, germanywestcentral, japaneast, japanwest, koreacentral, koreasouth, northcentralus, northeurope, norwayeast, norwaywest, southafricanorth, southafricawest, southcentralus, southeastasia, southindia, switzerlandnorth, switzerlandwest, uaecentral, uaenorth, uksouth, ukwest, westcentralus, westeurope, westindia, westus, westus2.
  • -t, --tag TEXT: A string to tag the datablob. If not passed, then the tag latest will be assigned to the datablob.
  • -q, --quiet: Output datablob uuid only.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.
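
With --quiet, the command prints only the datablob uuid, which makes it convenient to capture from a script. The sketch below is one way to do that, assuming the uuid is a standard UUID string and that nothing else is written to standard output in quiet mode. Both helper names are hypothetical, and create_datablob_from_local is defined but not called here because it needs an installed and authenticated airt CLI.

```python
import subprocess
import uuid
from typing import Optional


def parse_quiet_output(stdout: str) -> str:
    """Strip whitespace and check that the text is a well-formed UUID."""
    candidate = stdout.strip()
    uuid.UUID(candidate)  # raises ValueError if the text is not a UUID
    return candidate


def create_datablob_from_local(path: str, tag: Optional[str] = None) -> str:
    """Run `airt db from-local` in quiet mode and return the printed uuid."""
    argv = ["airt", "db", "from-local", "--path", path, "--quiet"]
    if tag is not None:
        argv += ["--tag", tag]
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return parse_quiet_output(result.stdout)


# Made-up uuid, used only to demonstrate the parsing step.
print(parse_quiet_output("3f2b8c1e-5d4a-4e6b-9a7c-1b2d3e4f5a6b\n"))
```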

airt db from-mysql

Create and return a datablob that encapsulates the data from a MySQL database.

If the database requires authentication, pass the username and password as command-line arguments or store them in the AIRT_CLIENT_DB_USERNAME and AIRT_CLIENT_DB_PASSWORD environment variables.

Usage:

$ airt db from-mysql [OPTIONS]

Options:

  • --host TEXT: Remote database host name. [required]
  • --database TEXT: Database name. [required]
  • --table TEXT: Table name. [required]
  • --port INTEGER: Host port number. If not passed, then the default value 3306 will be used. [default: 3306]
  • -cp, --cloud-provider TEXT: The destination cloud storage provider's name to store the datablob. Currently, the API only supports aws and azure as cloud storage providers. If None (default value), then aws will be used as the cloud storage provider.
  • -r, --region TEXT: The destination cloud provider's region to save your datablob. If None (default value) then the default region will be assigned based on the cloud provider. In the case of aws, eu-west-1 will be used and in the case of azure, westeurope will be used. The supported AWS regions are: ap-northeast-1, ap-northeast-2, ap-south-1, ap-southeast-1, ap-southeast-2, ca-central-1, eu-central-1, eu-north-1, eu-west-1, eu-west-2, eu-west-3, sa-east-1, us-east-1, us-east-2, us-west-1, us-west-2. The supported Azure Blob Storage regions are: australiacentral, australiacentral2, australiaeast, australiasoutheast, brazilsouth, canadacentral, canadaeast, centralindia, centralus, eastasia, eastus, eastus2, francecentral, francesouth, germanynorth, germanywestcentral, japaneast, japanwest, koreacentral, koreasouth, northcentralus, northeurope, norwayeast, norwaywest, southafricanorth, southafricawest, southcentralus, southeastasia, southindia, switzerlandnorth, switzerlandwest, uaecentral, uaenorth, uksouth, ukwest, westcentralus, westeurope, westindia, westus, westus2.
  • -u, --username TEXT: Database username. If not passed, the default value "root" will be used unless the value is explicitly set in the environment variable AIRT_CLIENT_DB_USERNAME.
  • -p, --password TEXT: Database password. If not passed, the default value "" will be used unless the value is explicitly set in the environment variable AIRT_CLIENT_DB_PASSWORD.
  • -t, --tag TEXT: A string to tag the datablob. If not passed, then the tag latest will be assigned to the datablob.
  • -q, --quiet: Output datablob uuid only.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.
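
The precedence described for --username and --password (explicit argument first, then the environment variable, then the built-in default) can be sketched as follows. This mirrors the option descriptions above and is not taken from the tool's source; the function name resolve_mysql_credentials is hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_mysql_credentials(
    username: Optional[str] = None,
    password: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Apply the documented order: argument, then environment, then default."""
    if username is None:
        username = environ.get("AIRT_CLIENT_DB_USERNAME", "root")
    if password is None:
        password = environ.get("AIRT_CLIENT_DB_PASSWORD", "")
    return username, password


# No arguments and an empty environment: the documented defaults apply.
print(resolve_mysql_credentials(environ={}))  # ('root', '')

# Environment variables override the defaults.
env = {"AIRT_CLIENT_DB_USERNAME": "reporting", "AIRT_CLIENT_DB_PASSWORD": "s3cret"}
print(resolve_mysql_credentials(environ=env))  # ('reporting', 's3cret')

# Explicit arguments win over the environment.
print(resolve_mysql_credentials(username="admin", environ=env))  # ('admin', 's3cret')
```

Keeping the password in the environment rather than on the command line also keeps it out of shell history.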

airt db from-s3

Create and return a datablob that encapsulates the data from an AWS S3 bucket.

Usage:

$ airt db from-s3 [OPTIONS] URI

Arguments:

  • URI: The AWS S3 bucket URI. [required]

Options:

  • --access-key TEXT: Access key for the S3 bucket. If None (default value), then the value from AWS_ACCESS_KEY_ID environment variable is used.
  • --secret-key TEXT: Secret key for the S3 bucket. If None (default value), then the value from AWS_SECRET_ACCESS_KEY environment variable is used.
  • -cp, --cloud-provider TEXT: The destination cloud storage provider's name to store the datablob. Currently, the API only supports aws and azure as cloud storage providers. If None (default value), then aws will be used as the cloud storage provider.
  • -r, --region TEXT: The destination cloud provider's region to save your datablob. If None (default value) then the default region will be assigned based on the cloud provider. In the case of aws, the datablob's source bucket region will be used and in the case of azure, westeurope will be used. The supported AWS regions are: ap-northeast-1, ap-northeast-2, ap-south-1, ap-southeast-1, ap-southeast-2, ca-central-1, eu-central-1, eu-north-1, eu-west-1, eu-west-2, eu-west-3, sa-east-1, us-east-1, us-east-2, us-west-1, us-west-2. The supported Azure Blob Storage regions are: australiacentral, australiacentral2, australiaeast, australiasoutheast, brazilsouth, canadacentral, canadaeast, centralindia, centralus, eastasia, eastus, eastus2, francecentral, francesouth, germanynorth, germanywestcentral, japaneast, japanwest, koreacentral, koreasouth, northcentralus, northeurope, norwayeast, norwaywest, southafricanorth, southafricawest, southcentralus, southeastasia, southindia, switzerlandnorth, switzerlandwest, uaecentral, uaenorth, uksouth, ukwest, westcentralus, westeurope, westindia, westus, westus2.
  • -t, --tag TEXT: A string to tag the datablob. If not passed, then the tag latest will be assigned to the datablob.
  • -q, --quiet: Output datablob uuid only.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.

airt db ls

Return the list of datablobs.

Usage:

$ airt db ls [OPTIONS]

Options:

  • -o, --offset INTEGER: The number of datablobs to offset at the beginning. If None, then the default value 0 will be used. [default: 0]
  • -l, --limit INTEGER: The maximum number of datablobs to return from the server. If None, then the default value 100 will be used. [default: 100]
  • --disabled: If set, only the deleted datablobs will be returned. Otherwise (the default), only the list of active datablobs will be returned.
  • --completed: If set, only the datablobs that have been successfully downloaded to the server will be returned. Otherwise (the default), all the datablobs will be returned.
  • -f, --format TEXT: Format output and show only the given column(s) values.
  • -q, --quiet: Output only datablob uuids separated by space.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.
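
The --offset and --limit options together allow paging through a long list. A minimal sketch of generating one command per page is shown below; the helper name ls_page_commands is hypothetical, and it assumes the caller already knows roughly how many datablobs exist.

```python
from typing import Iterator, List


def ls_page_commands(total: int, limit: int = 100) -> Iterator[List[str]]:
    """Yield one `airt db ls` argument list per page of `limit` datablobs."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    for offset in range(0, total, limit):
        yield ["airt", "db", "ls", "--offset", str(offset), "--limit", str(limit)]


# Three pages are needed to cover 250 datablobs at the default limit of 100.
for argv in ls_page_commands(250):
    print(" ".join(argv))
```

When the total is unknown, an alternative is to keep increasing the offset until a page returns fewer rows than the limit.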

airt db rm

Delete a datablob from the server.

Usage:

$ airt db rm [OPTIONS] UUID

Arguments:

  • UUID: Datablob uuid. [required]

Options:

  • -f, --format TEXT: Format output and show only the given column(s) values.
  • -q, --quiet: Output the deleted datablob uuid only.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.

airt db tag

Tag an existing datablob in the server.

Usage:

$ airt db tag [OPTIONS]

Options:

  • -uuid, --datablob_uuid TEXT: Datablob uuid in the server. [required]
  • -n, --name TEXT: A string to tag the datablob. [required]
  • -f, --format TEXT: Format output and show only the given column(s) values.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.

airt db to-datasource

Process the datablob and return a datasource object.

Usage:

$ airt db to-datasource [OPTIONS]

Options:

  • --uuid TEXT: Datablob uuid. [required]
  • --file-type TEXT: The file type of the datablob. Currently, the API only supports "csv" and "parquet" as file types. [required]
  • --index-column TEXT: The column to use as index (row labels). [required]
  • --sort-by TEXT: The column(s) to sort the data. Can either be a string or a JSON encoded list of strings. [required]
  • --deduplicate-data / --no-deduplicate-data: If set to True (default value False), the datasource will be created with duplicate rows removed. [default: no-deduplicate-data]
  • --blocksize TEXT: The number of bytes used to split larger files. If None, then the default value 256MB will be used. [default: 256MB]
  • --kwargs-json TEXT: Additional JSON-encoded dict arguments to use while processing the data. For example, to skip 100 lines from the bottom of the file, pass '{"skipfooter": 100}'.
  • -q, --quiet: Output datasource uuid only.
  • -d, --debug: Set logger level to DEBUG and output everything.
  • --help: Show this message and exit.
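
Both --sort-by and --kwargs-json accept JSON-encoded values, so building the arguments programmatically avoids quoting mistakes. The following is a sketch under stated assumptions: the helper name to_datasource_command, the uuid, and the column names are placeholders, and only the flags are taken from the option list above.

```python
import json
from typing import Any, Dict, List, Optional, Sequence, Union


def to_datasource_command(
    datablob_uuid: str,
    file_type: str,
    index_column: str,
    sort_by: Union[str, Sequence[str]],
    deduplicate: bool = False,
    kwargs: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """Build an argument list for `airt db to-datasource` (hypothetical helper)."""
    if file_type not in ("csv", "parquet"):
        raise ValueError('file_type must be "csv" or "parquet"')
    # A single column is passed as a plain string, several as a JSON list.
    sort_arg = sort_by if isinstance(sort_by, str) else json.dumps(list(sort_by))
    argv = [
        "airt", "db", "to-datasource",
        "--uuid", datablob_uuid,
        "--file-type", file_type,
        "--index-column", index_column,
        "--sort-by", sort_arg,
        "--deduplicate-data" if deduplicate else "--no-deduplicate-data",
    ]
    if kwargs:
        argv += ["--kwargs-json", json.dumps(kwargs)]
    return argv


argv = to_datasource_command(
    datablob_uuid="3f2b8c1e-5d4a-4e6b-9a7c-1b2d3e4f5a6b",  # made-up uuid
    file_type="csv",
    index_column="user_id",                  # placeholder column
    sort_by=["event_time", "user_id"],       # placeholder columns
    deduplicate=True,
    kwargs={"skipfooter": 100},
)
print(argv)
```

The list can be handed to subprocess.run as is; adding --quiet makes the command print only the datasource uuid.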