Dynamic Categorization

Dynamic categorization fits product information, such as title and description, to the best-matching category in a product taxonomy tree or categories list. The baseline dynamic categorization endpoint does not need to be tuned for your specific categories because it has learned relationships across different taxonomy trees.

A fine-tuned dynamic categorization model is a newly trained model that focuses on fitting product data to a specific taxonomy tree or categories list. Once the model has been fine-tuned on your specific tree, it cannot change. Because the model has been "steered" to your specific use case and products, it reaches much higher accuracy on your products and catalog information.

API information for the batch-categorize endpoint:

import json

import requests

url = 'https://app.pumice.ai/api/batch-categorize-products'

# Provide either csv_url or csv_id, not both; the unused one is left as None.
payload = json.dumps({
    "csv_url": 'https://raw.githubusercontent.com/Pumice/Test/master/amazon_data_10.csv',
    "run_type": "dynamic",
    "tree_id": 28,
    "model_id": "my_custom_model",
    "csv_id": None,
})
headers = {'KEY': '<YOUR_API_KEY>', 'Content-Type': 'application/json'}

# Submit the batch run; the response carries the task_id for the run.
response = requests.post(url, headers=headers, data=payload)
print(response.text)

Route

https://app.pumice.ai/api/batch-categorize-products

Return

This endpoint returns a task_id that identifies the batch run. Once the run has finished, call /get-results with the task_id to fetch the output.
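
Because a batch run finishes some time after the request returns, a client usually has to check back more than once. The request and response format of /get-results is not documented here, so the sketch below takes the fetch call and the completion check as arguments. `wait_for_results`, `fetch_results`, and `is_finished` are hypothetical names for illustration, not part of the Pumice.ai API.

```python
import time


def wait_for_results(fetch_results, task_id, is_finished,
                     interval_s=5.0, max_attempts=60):
    """Call fetch_results(task_id) until is_finished(result) is true.

    fetch_results and is_finished are supplied by the caller (assumption:
    the real /get-results request and its response shape are not known here).
    """
    for attempt in range(max_attempts):
        result = fetch_results(task_id)
        if is_finished(result):
            return result
        # Sleep between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(
        f"task {task_id} did not finish after {max_attempts} attempts"
    )
```

In practice, `fetch_results` would wrap your own call to /get-results, and `is_finished` would inspect whatever field the response uses to signal completion.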

Required Headers

API key and content type are required.

{'KEY': '<YOUR_API_KEY>','Content-Type': 'application/json'}

csv_url

URL of a publicly available CSV containing the product information to categorize. This lets you point directly at a live URL, such as a file in S3 or an export from a database. The CSV must contain the fields "Title" and "Description". Use this field if the CSV has not already been uploaded to Pumice.ai and is not being accessed via "csv_id". One of these two fields is required, but not both.

Type: string
Default: None
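
Since "Title" and "Description" are required, it can help to check a CSV's header row before submitting it. The following is a minimal sketch using only the standard library; `missing_required_columns` is a hypothetical helper, and matching column names exactly (case-sensitive, ignoring surrounding whitespace) is an assumption about how the endpoint reads them.

```python
import csv
import io

REQUIRED_COLUMNS = ("Title", "Description")


def missing_required_columns(csv_text):
    """Return the required column names absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    # Assumption: names are compared exactly after trimming whitespace.
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS if col not in present]
```

An empty list means both required fields are present; otherwise the list names what is missing.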

run_type

Selects which categorization endpoint to use. Options are "static" and "dynamic".

Type: string
Default: None

Example Payload:

{
    "csv_id": 5,
    "tree_id": 28,
    "run_type": "dynamic",
    "model_id": "google"
}

tree_id

ID of an uploaded taxonomy tree. This ID comes from the upload tree endpoint: https://app.pumice.ai/api/upload-tree. Uploading lets you store your trees and reuse them in processes like this one.

Type: int
Default: None

model_id

ID used to access your fine-tuned model or any fine-tuned models available across plans. If you are a fine-tuned or enterprise user and do not know your model_id, please reach out to support. The model_id is case-sensitive.

Type: string
Default: None

csv_id

ID of an uploaded CSV. This ID comes from the upload CSV endpoint: https://app.pumice.ai/api/upload-csv. Uploading lets you store your CSVs and reuse them in processes like this one. This field is not required if you're using the csv_url field.

Type: int
Default: None
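
The parameter rules above (run_type is "static" or "dynamic", and exactly one of csv_url and csv_id is supplied) can be checked before the request is sent. The following is a minimal sketch; `build_batch_payload` is a hypothetical helper, and treating tree_id as always present, model_id as optional, and omitting unset fields instead of sending null are assumptions, not documented behavior.

```python
import json

VALID_RUN_TYPES = ("static", "dynamic")


def build_batch_payload(run_type, tree_id, model_id=None,
                        csv_url=None, csv_id=None):
    """Validate the documented rules and return the JSON request body."""
    if run_type not in VALID_RUN_TYPES:
        raise ValueError(f"run_type must be one of {VALID_RUN_TYPES}")
    # Exactly one CSV source is allowed: a public URL or an uploaded CSV id.
    if (csv_url is None) == (csv_id is None):
        raise ValueError("provide exactly one of csv_url or csv_id")

    payload = {"run_type": run_type, "tree_id": tree_id}
    if model_id is not None:
        payload["model_id"] = model_id  # case-sensitive, passed through as-is
    # Assumption: the unused CSV field is omitted rather than sent as null.
    if csv_url is not None:
        payload["csv_url"] = csv_url
    else:
        payload["csv_id"] = csv_id
    return json.dumps(payload)
```

For example, `build_batch_payload("dynamic", 28, model_id="my_custom_model", csv_id=5)` produces a body that can be passed as the `data` argument of the POST request.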

Code Examples

Python functions to help you process the JSON output of a batch product run.

Pumice.ai formatted JSON to a pandas DataFrame:

import json
import pandas as pd

def pumice_json_to_pandas(filename):
    # Build a DataFrame from the records stored under 'data'
    with open(filename) as f:
        data = json.load(f)['data']
    return pd.DataFrame(data)

Pumice.ai formatted JSON to a CSV file:

import csv
import json

def pumice_json_to_csv(filename, csv_file):
    # Read the list of product records stored under 'data'
    with open(filename) as f:
        rows = json.load(f)['data']
    # Use the first record's keys as the CSV header, then write every record
    with open(csv_file, 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
    return 'Success!'
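
The converter above takes its CSV header from the first record, so `csv.DictWriter` raises a `ValueError` if a later record has a key the first one lacks. If your output records do not all share the same keys, a variant along these lines may be more forgiving. It is a sketch only: `records_to_csv` is a hypothetical name, and the one thing assumed about the file is the top-level 'data' list used by the functions in this section.

```python
import csv
import json


def records_to_csv(json_path, csv_path):
    """Write every record under 'data' to a CSV; return the record count."""
    with open(json_path) as f:
        rows = json.load(f)["data"]

    # Collect the union of keys across all records, in first-seen order.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, "w", newline="") as csvfile:
        # restval fills in cells for keys a given record does not have.
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Records missing a key get an empty cell in that column instead of causing an error.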

Pumice.ai pandas DataFrame to CSV:

def pumice_pandas_to_csv(pumice_df, csv_file):
    pumice_df.to_csv(csv_file)
    return 'Success!'