Process Sales files

Supported format

Sales can be documented in many file formats, but some are better suited to analysis than others. The API therefore accepts only the file types listed for the filetype field below: CSV, TXT, TSV and XLS.

Restrictions

  • If you upload any other kind of file, it is discarded automatically.

  • If the file is not in the correct format, the API returns an error message.

  • If the file format has not been ingested into the database, the API does not recognize it and discards the file automatically.

  • If the file is empty, the API discards it automatically.

  • If the file is corrupted, the API discards it automatically.

  • If the input is incomplete, the API returns an error message.
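
Some of the restrictions above can be checked on the client before anything is uploaded. The following is a minimal sketch, assuming the supported formats are the ones listed for the filetype field (CSV, TXT, TSV, XLS) and that the file type can be read from the file extension. The function name precheck_sales_file is hypothetical and not part of the API.

```python
from pathlib import Path

# Assumption: supported formats are those listed for the `filetype` field,
# matched here by file extension only (content is not inspected).
SUPPORTED_FILETYPES = {"CSV", "TXT", "TSV", "XLS"}


def precheck_sales_file(path):
    """Return a list of problems found before upload (empty list = looks OK)."""
    p = Path(path)
    if not p.is_file():
        return [f"{p} does not exist or is not a regular file"]
    problems = []
    filetype = p.suffix.lstrip(".").upper()
    if filetype not in SUPPORTED_FILETYPES:
        problems.append(f"unsupported file type: {filetype or '(none)'}")
    if p.stat().st_size == 0:
        problems.append("file is empty")
    return problems
```

A check like this cannot detect corrupted or incomplete files; those cases are still reported by the API itself.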

Upload manually

Our API supports manual file uploads. The manual upload flow relies on several modules and routes.

The routes to use are the following:

POST /api/files/salesfiles/

This route creates a new file record, which is used later when the file is processed.

Form Data

| Field | Type | Description |
|---|---|---|
| filename* | string | Name of the file to be processed |
| filetype* | string | File type (CSV, TXT, TSV, XLS) |
| filesize* | integer | File size in bits |
| mongo_db_id* | string | MongoDB ID used to identify the data |
| status | string | Current file status. INITIALIZED by default |
| total_rows | integer | Total rows in the file. Calculated after processing |
| total_artists | integer | Total artists in the file. Calculated after processing |
| status_cause | string | Reason the file ended in a failed status |
| error_message | string | Full error message |
| total_local | decimal | Total revenue for the file. Calculated after processing |
| reporting_period | string | Reporting period of the file |
| upload_mode | string | Upload method used for the file. MANUALLY by default |
| response_module | string | PENDING |
| organisation | integer | ID of the organisation that uploads the file |
| exchange_rate | float | Exchange rate used in the file to calculate revenues |

* required fields
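
The request for this route can be sketched as follows. This is only an illustration: the base URL is a placeholder, the form is assumed to be URL-encoded, and authentication is omitted because this document does not describe it. The helper name build_create_request is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL; replace it with the host of your deployment.
BASE_URL = "https://oas.example.com"

# Fields marked with * in the Form Data table.
REQUIRED_FIELDS = ("filename", "filetype", "filesize", "mongo_db_id")


def build_create_request(fields, base_url=BASE_URL):
    """Build (but do not send) the POST that creates a sales file record."""
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    return Request(
        f"{base_url}/api/files/salesfiles/",
        # Assumption: the route accepts URL-encoded form data.
        data=urlencode(fields).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

The returned object can be passed to urllib.request.urlopen once the real host and credentials are in place.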

Response

Returns the same fields, plus the following:

| Field | Type | Description |
|---|---|---|
| id | integer | Unique ID |
| module_response | string | Module called to upload the current file |
| uuid | string | Unique 8-character ID |

POST /api/files/salesfiles/{id}/process/

Determines the actions to apply to the file(s) according to the “action” field.

Form Data

| Field | Type | Description |
|---|---|---|
| id | integer | Unique ID |
| action | string | Action to perform. See the supported values below |

  • request_presigned_url: creates a presigned URL in S3 for the file.

  • start_cleaning: starts the cleaning process, which prepares the file for importer processing.

Response

PENDING

| Field | Type | Description |
|---|---|---|
| id | integer | Unique ID |
| presigned_url | string | Presigned URL for the file |
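
A call to the process route can be sketched like this. The base URL and the helper name build_process_request are placeholders, and URL-encoded form data is an assumption. The list of actions only collects the values named in this document.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL; replace it with the host of your deployment.
BASE_URL = "https://oas.example.com"

# Action values named in this document for the process route.
KNOWN_ACTIONS = ("request_presigned_url", "start_cleaning", "reimport")


def build_process_request(file_id, action, base_url=BASE_URL):
    """Build (but do not send) a POST to /api/files/salesfiles/{id}/process/."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return Request(
        f"{base_url}/api/files/salesfiles/{int(file_id)}/process/",
        # Assumption: the route accepts URL-encoded form data.
        data=urlencode({"action": action}).encode("utf-8"),
        method="POST",
    )
```

Rejecting unknown actions on the client keeps a typo from reaching the API, at the cost of having to update the tuple if new actions are introduced.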

When you upload a file manually, the request passes through OAS, which creates the file record with status = INITIALIZED and then generates the S3 path where the file will be uploaded. After that, OAS calls the upload module (Glider route), which normalizes and standardizes the file so that it can be processed.

If the file passes through this module without issues, a snapshot is created with the file's initial information, and the importer module (Glider route) is called to process it.
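
From the client's point of view, the manual flow comes down to four calls in a fixed order. The sketch below shows only that ordering. The two callables api_post and s3_upload are hypothetical helpers that you would implement with your own HTTP client, and the response keys id and presigned_url are taken from the response tables above.

```python
def upload_manually(api_post, s3_upload, file_bytes, metadata):
    """Run the manual upload steps in order and return the last API response.

    api_post(path, form) -> dict : hypothetical helper that POSTs form data to OAS
    s3_upload(url, data) -> None : hypothetical helper that sends bytes to S3
    """
    # 1. Create the file record (status INITIALIZED).
    created = api_post("/api/files/salesfiles/", metadata)
    process_path = f"/api/files/salesfiles/{created['id']}/process/"

    # 2. Ask for a presigned URL for this file.
    presigned = api_post(process_path, {"action": "request_presigned_url"})

    # 3. Upload the file content to S3 through the presigned URL.
    s3_upload(presigned["presigned_url"], file_bytes)

    # 4. Start cleaning only after the upload has finished.
    return api_post(process_path, {"action": "start_cleaning"})
```

Passing the transport functions in as arguments keeps the sequence testable without network access.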

    sequenceDiagram
    actor User
    participant FrontEnd
    participant OAS
    participant Glider
    participant S3
    participant MongoDB

    User->>FrontEnd: Upload File
    activate FrontEnd
    FrontEnd->>OAS: Create file
    activate OAS
    OAS->>OAS: Create file with status INITIALIZED
    OAS-->>FrontEnd: Response status ok
    FrontEnd->>OAS: Request presigned URL
    deactivate OAS

    activate OAS
    OAS->>S3: Generate presigned URL
    S3-->>OAS: Return presigned URL
    OAS-->>FrontEnd: Response presigned url
    deactivate OAS

    FrontEnd->>S3: Send File
    S3-->>FrontEnd: Status Ok

    activate OAS
    FrontEnd->>OAS: Start process cleaning
    OAS->>OAS: Update file status to UPLOADED

    activate MongoDB
    OAS->>MongoDB: Save initial information
    MongoDB-->>OAS: Return response OK
    deactivate MongoDB

    OAS->>OAS: Update file status to CLEANING
    activate Glider
    OAS->>Glider: Activate Cleaning Module
    Glider->>Glider: Module Cleaning
    Glider-->>OAS: Call webhook with success response
    Note right of OAS: Process continues if the webhook call is OK
    deactivate Glider

    OAS->>OAS: Update file status to CLEANED

    activate Glider
    OAS->>Glider: Activate IMPORTER MODULE
    Glider->>Glider: Process information

    activate MongoDB
    Glider->>MongoDB: Save snapshot and sales
    MongoDB-->>Glider: Return response
    deactivate MongoDB

    Glider-->>OAS: Call webhook with success response

    activate MongoDB
    OAS->>MongoDB: Retrieve Snapshots generated
    MongoDB-->>OAS: Response Data
    deactivate MongoDB

    OAS->>OAS: Update status to PROCESSED
    deactivate Glider
    deactivate OAS
    deactivate FrontEnd
    

Upload by Google Drive

POST /api/files/upload_automatically/

This method accepts a Google Drive link (to a single file or a folder) and runs the entire upload and cleaning procedure on each item.

When a sales file is uploaded manually, you must wait until it has been uploaded before calling the route to begin processing.

Restrictions

  • Google API setup is required.

  • The shared link must be set to “Anyone with the link” for the procedure to work.

  • Only files in the supported formats can be uploaded.

  • The shared link must not lead directly to a zip file. Zip files must be placed inside a folder or subfolder instead.

Form Data

| Field | Type | Description |
|---|---|---|
| url | string | Presigned URL generated by the storage system that provides temporary access to retrieve file information |

Response

| Field | Type | Description |
|---|---|---|
| status | string | Status of the uploaded file |
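
A request to this route might be built as shown below. This is a sketch under assumptions: the base URL and the helper name build_auto_upload_request are placeholders, and the form is assumed to be URL-encoded. The only check made here is that the link is an absolute http(s) URL; sharing permissions and zip contents can only be verified by the service.

```python
from urllib.parse import urlencode, urlsplit
from urllib.request import Request

# Hypothetical base URL; replace it with the host of your deployment.
BASE_URL = "https://oas.example.com"


def build_auto_upload_request(link, base_url=BASE_URL):
    """Build (but do not send) a POST to /api/files/upload_automatically/."""
    parts = urlsplit(link)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {link!r}")
    return Request(
        f"{base_url}/api/files/upload_automatically/",
        # Assumption: the route accepts URL-encoded form data.
        data=urlencode({"url": link}).encode("utf-8"),
        method="POST",
    )
```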

Import/Reimport Sales Files

Importing is a term used to describe the processing of sales files. When a file is uploaded manually, you must wait until it has been uploaded via the pre-signed link to activate the manual import.

This is different from files uploaded automatically via a link, such as one from Google Drive.

Importer Processing

POST /api/files/salesfiles/{id}/process/

Once the file(s) are synced with our servers, the importer process is triggered. This process is responsible for importing the files: it processes each file, normalizes the data, generates a .parquet file and stores it in S3, and generates snapshots in MongoDB.

This step is the core of the service, because once the data is processed, it can be used for several purposes.

Form Data

| Field | Type | Description |
|---|---|---|
| action | string | Action to perform. In this case, it must be start_cleaning |

Response

| Field | Type | Description |
|---|---|---|
| status | string | Status of the file being processed |

  sequenceDiagram
  participant OAS as OAS System
  participant Glider as Importer Module
  participant AWS S3
  participant MongoDB

  Note over OAS,MongoDB: Importer Procedure

  OAS->>Glider: Call IMPORTER module
  Glider->>Glider: Process file and generate .parquet file
  Glider->>AWS S3: Store .parquet file in S3
  Glider->>MongoDB: Generate snapshots in MongoDB
  Glider->>OAS: Notify that processing has finished
    

Reimport Processing

POST /api/files/salesfiles/{id}/process/

This functionality is useful in any of the following situations:

  • The import failed due to a system issue.

  • The data is incorrect (In this case, re-importing could generate the same errors again; it is recommended to contact the administrator to investigate the root cause in detail.)

Form Data

| Field | Type | Description |
|---|---|---|
| action | string | Action to perform. In this case, it must be reimport |

Response

| Field | Type | Description |
|---|---|---|
| status | string | Status of the file being processed |

Status Files

The following describes the possible statuses that a sales file can have in OAS.

| Status | Description |
|---|---|
| INITIALIZED | File processing has been initialized |
| UPLOADING | File is currently being uploaded |
| UPLOADED | File has been uploaded to our server |
| CLEANING | File is currently being cleaned (prepared for processing) |
| CLEANED | File has been cleaned |
| PROCESSING | File is currently being processed |
| PROCESSED | File has been processed |
| FINALIZED | File processing has finished successfully |
| FAILED | File processing has failed |
| REIMPORT | File has been reimported |
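
A client that needs to wait for processing to finish could model these statuses and poll for them. The sketch below is illustrative only. The callable fetch_status is hypothetical, since this document does not describe a route for reading a file's status, and treating FINALIZED and FAILED as the end states is an assumption that can be overridden through the terminal parameter.

```python
import time
from enum import Enum


class FileStatus(str, Enum):
    INITIALIZED = "INITIALIZED"
    UPLOADING = "UPLOADING"
    UPLOADED = "UPLOADED"
    CLEANING = "CLEANING"
    CLEANED = "CLEANED"
    PROCESSING = "PROCESSING"
    PROCESSED = "PROCESSED"
    FINALIZED = "FINALIZED"
    FAILED = "FAILED"
    REIMPORT = "REIMPORT"


# Assumption: polling can stop at FINALIZED or FAILED. Adjust if your flow
# treats another status (for example PROCESSED) as the end state.
DEFAULT_TERMINAL = frozenset({FileStatus.FINALIZED, FileStatus.FAILED})


def wait_for_status(fetch_status, terminal=DEFAULT_TERMINAL,
                    interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll fetch_status() until it returns a status contained in `terminal`."""
    for _ in range(max_polls):
        status = FileStatus(fetch_status())  # raises ValueError on unknown values
        if status in terminal:
            return status
        sleep(interval)
    raise TimeoutError("file did not reach a terminal status in time")
```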

Report process

One of the main things you can do once the importer has processed the data is generate reports in CSV format. Two types of report are available:

  • Monthly: These refer to the reports that contain the sales made in a given period and that are also integrated into the catalog.

  • Unmatched: These reports contain the data (sales) that were made in a given period but are not integrated into the catalog.

Every report is sent to the user via a presigned link that expires after 24 hours.
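
Because the link is only valid for 24 hours, a client that stores it may want to check whether it is still usable before downloading. A minimal sketch, assuming the client records the time at which it received the link (issued_at is a hypothetical value kept by the client, not a field returned by the API):

```python
from datetime import datetime, timedelta, timezone

# Expiration stated in this document for report links.
REPORT_LINK_TTL = timedelta(hours=24)


def report_link_expired(issued_at, now=None):
    """Return True if a link issued at `issued_at` (timezone-aware) has expired."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= issued_at + REPORT_LINK_TTL
```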

  sequenceDiagram
  participant OAS as OAS System
  participant Glider as Report Module
  participant AWS S3
  participant MongoDB

  Note over OAS,MongoDB: Report Procedure

  OAS->>Glider: Call REPORT module
  Glider->>MongoDB: Request sales in X period
  MongoDB->>Glider: Returns all sales in X period
  Glider->>Glider: Process sales and generate .csv.gz file
  Glider->>AWS S3: Store report.csv.gz file in S3
  AWS S3->>Glider: Generates presigned link
  Glider->>MongoDB: Generate Report snapshot in MongoDB
  Glider->>OAS: Notify processing finish status
    

Summary revenue by file

In this section, we can find the revenues generated by file, once they have been processed. You can also find information about the revenues by artist, album, track, and DSP per single file.

To consult it, a GET request can be sent to the following route indicating the snapshot ID.

GET /api/files/salesfiles/{id}/snapshot/

It returns the total revenues per artist, DSP, territory, track and release.

Form Data

| Field | Type | Description |
|---|---|---|
| id* | integer | Unique ID |

* required fields

Response

| Field | Type | Description |
|---|---|---|
| byArtist | list | List of dictionaries containing the total revenue per artist |
| byDSP | list | List of dictionaries containing the total revenue per DSP (service) |
| byTerritory | list | List of dictionaries containing the total revenue per territory |
| byTrack | list | List of dictionaries containing the total revenue per track |
| byRelease | list | List of dictionaries containing the total revenue per release |
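
The keys inside each dictionary are not described here, so a client can at most verify that the five documented sections are present before using them. The sketch below assumes the response body is a JSON object; the helper name split_snapshot is hypothetical.

```python
import json

# Section names documented in the response table.
EXPECTED_SECTIONS = ("byArtist", "byDSP", "byTerritory", "byTrack", "byRelease")


def split_snapshot(raw_json):
    """Parse a snapshot response and return (sections, missing).

    `sections` maps each documented name to its list; `missing` names the
    documented sections that are absent or not a list.
    """
    payload = json.loads(raw_json)  # assumption: the body is a JSON object
    sections = {
        name: payload[name]
        for name in EXPECTED_SECTIONS
        if isinstance(payload.get(name), list)
    }
    missing = [name for name in EXPECTED_SECTIONS if name not in sections]
    return sections, missing
```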

Summary revenue all time

As in the previous section, this route reports revenues, but here they are total revenues over a period of X. The response also contains key information that helps with data processing.

To consult it, send a GET request to the following route, indicating the ID of the path where all the processed files have been uploaded.

It returns overall counts per active organization.

GET /api/sales/summary/

Response

| Field | Type | Description |
|---|---|---|
| files | integer | Total number of files processed |
| artists | integer | Total number of artists |
| releases | integer | Total number of releases |
| tracks | integer | Total number of tracks |
| total_gross | integer | Total gross generated |
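
On the client, the summary can be mapped onto a small typed structure. This sketch assumes the response body is a single JSON object with exactly the fields listed above; if the API returns one entry per organization instead, the same class could be applied to each entry. The class name SalesSummary is hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesSummary:
    files: int
    artists: int
    releases: int
    tracks: int
    total_gross: int

    @classmethod
    def from_json(cls, raw_json):
        """Build a summary from a JSON object, ignoring undocumented keys."""
        payload = json.loads(raw_json)
        # A missing documented field raises KeyError rather than being guessed.
        return cls(**{name: payload[name] for name in cls.__dataclass_fields__})
```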