# Configuring processing and parallelism

The following environment variables control job and file processing.

## Configuring the number of jobs to run concurrently <a href="#config-concurrent-jobs" id="config-concurrent-jobs"></a>

The number of jobs that can run concurrently can affect the [number of Textual workers](https://docs.tonic.ai/textual/textual-install-administer/configuring-textual/general-instance-and-processing-settings/textual-config-ml-worker-count) that you need. The more jobs that each worker can run concurrently, the fewer workers you need.

Textual provides two kinds of environment variables to control the number of jobs that each Textual worker can run at the same time:

* A global setting to control the total number of jobs across all job types.
* Individual settings to control the number of concurrent jobs for specific job types.

### Configuring the global limit for concurrent jobs <a href="#concurrent-jobs-global-limit" id="concurrent-jobs-global-limit"></a>

The environment variable `SOLAR_MAX_CONCURRENT_WORKER_JOBS` controls the number of jobs that can run concurrently across all of the job types.

The default value is 16.
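
As a rough illustration of how the global limit relates to the number of workers, the following Python sketch divides an expected peak number of concurrent jobs by the per-worker limit. It is not a Tonic sizing formula. It assumes that the global limit is the only binding constraint and that jobs spread evenly across workers. The function name `estimate_workers` and its parameters are hypothetical.

```python
import math


def estimate_workers(peak_concurrent_jobs, max_jobs_per_worker=16):
    """Rough worker estimate for a given peak job load.

    max_jobs_per_worker mirrors SOLAR_MAX_CONCURRENT_WORKER_JOBS
    (documented default: 16). Assumption: jobs are spread evenly and
    no per-job-type limit is reached first.
    """
    if peak_concurrent_jobs < 0:
        raise ValueError("peak_concurrent_jobs must not be negative")
    if max_jobs_per_worker < 1:
        raise ValueError("max_jobs_per_worker must be at least 1")
    # Round up: a partially used worker is still a whole worker.
    return math.ceil(peak_concurrent_jobs / max_jobs_per_worker)


if __name__ == "__main__":
    # Compare the default limit with a lower limit for the same load.
    print(estimate_workers(40))
    print(estimate_workers(40, max_jobs_per_worker=8))
```

Under these assumptions, lowering the per-worker limit raises the estimate, which matches the relationship described earlier in this section.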

### Configuring the limits for specific job types <a href="#concurrent-jobs-limit-by-type" id="concurrent-jobs-limit-by-type"></a>

The following environment variables control the number of jobs that can run concurrently for specific types of jobs.

* `SOLAR_MAX_CONCURRENT_JOBS_DEIDENTIFY_FILE` \
  \
  Default value: 8\
  \
  Includes the following actions:
  * Upload a file to a dataset
  * Rescan a dataset file that fails to process
  * Rescan a dataset file after a custom entity type is created or edited
  * Rescan a dataset file after an update to the dataset configuration
  * Textual uploads a cloud storage file for processing
* `SOLAR_MAX_CONCURRENT_JOBS_DEIDENTIFY_UNATTACHED_FILE` \
  \
  Default value: 8\
  \
  Includes the following actions:
  * Upload a file to the **Home** page preview tool
  * In the SDK, make a call to `redact.start_file_redaction`
* `SOLAR_MAX_CONCURRENT_JOBS_PARSE_FILES` \
  \
  Default value: 8\
  \
  Includes the following actions:
  * Run parsing jobs
* `SOLAR_MAX_CONCURRENT_JOBS_AUDIO_TRANSCRIPTION` \
  \
  Default value: 8\
  \
  Includes the following actions:
  * Transcribe and redact an audio file
* `SOLAR_MAX_CONCURRENT_JOBS_PROCESS_EXTERNAL_FILES` \
  \
  Default value: 8\
  \
  Includes the following actions:
  * Synchronize cloud storage files for a dataset
* `SOLAR_MAX_CONCURRENT_JOBS_GENERATE_EXTERNAL_FILES` \
  \
  Default value: 8\
  \
  Includes the following actions:
  * Generate output files for a cloud storage dataset
* `SOLAR_MAX_CONCURRENT_JOBS_MANUAL_REDACTION_PROCESS_FILE` \
  \
  Default value: 8\
  \
  Includes the following actions:
  * Scan a file in a guided redaction project
* `SOLAR_MAX_CONCURRENT_JOBS_FILE_ANNOTATION` \
  \
  Default value: 5\
  \
  Includes the following actions:
  * For a model-based custom entity type, identify the entity type values in the test data.
* `SOLAR_MAX_CONCURRENT_JOBS_GENERATE_NEW_GUIDELINES` \
  \
  Default value: 2\
  \
  Includes the following actions:
  * For a model-based custom entity type, generate recommendations to improve the guidelines.
* `SOLAR_MAX_CONCURRENT_JOBS_TRAINING_FILE_ANALYSIS` \
  \
  Default value: 4\
  \
  Includes the following actions:
  * For a model-based custom entity type, analyze the training files.
* `SOLAR_MAX_CONCURRENT_JOBS_TRAINING_FILE_ANNOTATION` \
  \
  Default value: 4\
  \
  Includes the following actions:
  * For a model-based custom entity type, use the selected guidelines version to identify the entity values in the training data.
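
To keep overrides of the per-job-type limits manageable, the following Python sketch records the documented defaults and renders only the changed settings as `KEY=VALUE` lines. The helper `render_env_overrides` is hypothetical. The `KEY=VALUE` output is a generic env-file style, and how you pass environment variables to Textual depends on your deployment. The check for a positive integer is also an assumption, because this page does not state the allowed range.

```python
# Documented defaults for the per-job-type limits listed above.
DEFAULT_JOB_LIMITS = {
    "SOLAR_MAX_CONCURRENT_JOBS_DEIDENTIFY_FILE": 8,
    "SOLAR_MAX_CONCURRENT_JOBS_DEIDENTIFY_UNATTACHED_FILE": 8,
    "SOLAR_MAX_CONCURRENT_JOBS_PARSE_FILES": 8,
    "SOLAR_MAX_CONCURRENT_JOBS_AUDIO_TRANSCRIPTION": 8,
    "SOLAR_MAX_CONCURRENT_JOBS_PROCESS_EXTERNAL_FILES": 8,
    "SOLAR_MAX_CONCURRENT_JOBS_GENERATE_EXTERNAL_FILES": 8,
    "SOLAR_MAX_CONCURRENT_JOBS_MANUAL_REDACTION_PROCESS_FILE": 8,
    "SOLAR_MAX_CONCURRENT_JOBS_FILE_ANNOTATION": 5,
    "SOLAR_MAX_CONCURRENT_JOBS_GENERATE_NEW_GUIDELINES": 2,
    "SOLAR_MAX_CONCURRENT_JOBS_TRAINING_FILE_ANALYSIS": 4,
    "SOLAR_MAX_CONCURRENT_JOBS_TRAINING_FILE_ANNOTATION": 4,
}


def render_env_overrides(overrides):
    """Return KEY=VALUE lines for settings that differ from the defaults."""
    lines = []
    for name, value in sorted(overrides.items()):
        if name not in DEFAULT_JOB_LIMITS:
            raise KeyError(f"Unknown job limit variable: {name}")
        # Assumption: a limit must be a positive integer.
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer")
        # Skip values that match the documented default.
        if value != DEFAULT_JOB_LIMITS[name]:
            lines.append(f"{name}={value}")
    return lines


if __name__ == "__main__":
    changes = {
        "SOLAR_MAX_CONCURRENT_JOBS_PARSE_FILES": 4,
        "SOLAR_MAX_CONCURRENT_JOBS_FILE_ANNOTATION": 5,  # same as default
    }
    print("\n".join(render_env_overrides(changes)))
```

Rejecting unknown names catches typos in the variable names before they reach a deployment, where a misspelled variable would presumably just be ignored.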

## Configuring the size of the datetime generator cache <a href="#config-datetime-cache" id="config-datetime-cache"></a>

To optimize processing when it generates datetime values, Textual stores the redacted datetime values in a cache.

To change the cache size, configure the environment variable `SOLAR_DATETIME_GENERATOR_CACHE_CAPACITY`.

The default value is 100000, meaning that the cache can hold up to 100,000 values.

Note that while increasing the size of the cache can speed up processing, it also uses more RAM.

## Configuring the number of PDF pages to redact simultaneously <a href="#config-pdf-page-parallelism" id="config-pdf-page-parallelism"></a>

When Textual redacts PDF files so that a user can preview or download the output, the following environment variable determines the number of pages that it processes simultaneously:

`SOLAR_PDF_PAGE_REDACTION_PARALLELISM`

The default value is 4, meaning that Textual processes 4 pages at a time.
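
To illustrate what a page parallelism of 4 means in general terms, the following Python sketch processes pages with a thread pool that runs at most 4 tasks at once. This is a generic bounded-parallelism pattern, not Textual's implementation. The functions `redact_page` and `redact_pages` are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

# Mirrors the documented default of SOLAR_PDF_PAGE_REDACTION_PARALLELISM.
PAGE_PARALLELISM = 4


def redact_page(page_number):
    """Hypothetical stand-in for the work done on a single page."""
    return f"page-{page_number}-redacted"


def redact_pages(page_numbers, parallelism=PAGE_PARALLELISM):
    """Process pages with at most `parallelism` pages in flight at once."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # pool.map returns results in the same order as the input pages.
        return list(pool.map(redact_page, page_numbers))


if __name__ == "__main__":
    print(redact_pages(range(1, 11)))
```

In a pattern like this, a higher value lets more pages proceed at once but uses more CPU and memory at the same time.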

## Configuring the number of PDF files to plan simultaneously <a href="#config-pdf-plan-parallelism" id="config-pdf-plan-parallelism"></a>

When Textual plans the redaction of PDF files for a user to preview or download, the following environment variable determines the number of files that it plans simultaneously:

`SOLAR_PDF_DOC_PLAN_PARALLELISM`

The default value is 3, meaning that Textual plans 3 PDF files at a time.

## Configuring how often to purge cached PDF pages <a href="#config-pdf-page-cache-purge" id="config-pdf-page-cache-purge"></a>

When it redacts PDF files, Textual stores the redacted PDF pages in a cache.

The following environment variable determines how often Textual purges the cache of PDF pages:

`PURGE_REDACTED_PAGES_IN_HOURS`

The default value is 12, meaning that Textual purges the redacted PDF pages cache every 12 hours.

