Dynamic Categorization
Dynamic categorization fits product information, such as title and description, to the best-matching node in a product taxonomy tree or categories list. The baseline dynamic categorization endpoint does not need to be tuned for your specific categories, because it has learned a relationship across different taxonomy trees.
A fine-tuned dynamic categorization model is a newly trained model that focuses on fitting product data to one specific taxonomy tree or categories list. Once the model has been fine-tuned on your specific tree, it cannot change. The model has been "steered" to your specific use case and products, which leads to much higher accuracy on your products and catalog information.
API information for batch-categorize endpoint:
import requests
import json

url = 'https://app.pumice.ai/api/batch-categorize-products'

payload = json.dumps({
    "csv_url": 'https://raw.githubusercontent.com/Pumice/Test/master/amazon_data_10.csv',
    "run_type": "dynamic",
    "tree_id": 28,
    "model_id": "my_custom_model",
    "csv_id": None
})
headers = {
    'KEY': '<YOUR_API_KEY>',
    'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Route
https://app.pumice.ai/api/batch-categorize-products
Return
This endpoint returns a task_id that identifies the batch run. Once the run has finished, pass the task_id to /get-results to fetch the output.
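Because the batch run completes some time after the request returns, a client typically polls until the results are ready. The sketch below shows one way to structure that loop; it is not Pumice.ai's documented client. The request and response format of /get-results is not described in this section, so the network call is left to a caller-supplied `fetch_results` function, and the "status" field with its "finished" value is an assumption made only for illustration.

```python
import time


def wait_for_results(task_id, fetch_results, interval_s=5.0, max_attempts=60):
    """Poll until the batch run identified by task_id reports completion.

    fetch_results is a caller-supplied (hypothetical) function that calls
    /get-results for the given task_id and returns the decoded JSON as a dict.
    The "status" key and the "finished" value are assumptions, not documented
    fields; adjust them to match the real response.
    """
    for attempt in range(max_attempts):
        result = fetch_results(task_id)
        if result.get("status") == "finished":
            return result
        # Sleep between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} attempts")
```

Keeping the HTTP call outside the loop means the polling logic can be exercised with a stub before it is pointed at the real endpoint.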
Required Headers
API key and content type are required.
{'KEY': '<YOUR_API_KEY>','Content-Type': 'application/json'}
csv_url
URL of a publicly available CSV containing the product information to categorize. This lets you point directly at a live URL, such as S3 or a database. The CSV must contain the fields "Title" and "Description". Use csv_url when the CSV has not already been uploaded to Pumice.ai and referenced via "csv_id". Exactly one of csv_url and csv_id is required; do not send both.
Type: string
Default: None
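Since the CSV must contain "Title" and "Description" columns, it can help to check the header before submitting a run. The following is a minimal sketch using only the standard library. The function name `missing_required_columns` is made up for this example, and it assumes the column names must match exactly, including case, which this section does not confirm.

```python
import csv
import io

REQUIRED_COLUMNS = ("Title", "Description")


def missing_required_columns(csv_text):
    """Return the required column names that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


sample = "Title,Description,Price\nMug,Ceramic coffee mug,9.99\n"
print(missing_required_columns(sample))                     # []
print(missing_required_columns("Title,Price\nMug,9.99\n"))  # ['Description']
```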
run_type
Selects which categorization endpoint to use. Options are "static" and "dynamic".
Type: string
Default: None
Example Payload:
{"csv_id": 5, "tree_id": 28, "run_type": "dynamic", "model_id": "google"}
tree_id
Id of an uploaded taxonomy tree. This id comes from the upload tree endpoint (https://app.pumice.ai/api/upload-tree), which lets you upload and store your trees for processes like this one.
Type: int
Default: None
model_id
Id used to access your fine-tuned model or any fine-tuned models available across plans. If you are a fine-tuned or enterprise user and do not know your model_id, please reach out to support. The model_id is case-sensitive.
Type: string
Default: None
csv_id
Id of an uploaded CSV. This id comes from the upload CSV endpoint (https://app.pumice.ai/api/upload-csv), which lets you upload and store your CSVs for processes like this one. This field is not required if you're using the csv_url field.
Type: int
Default: None
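The rule that exactly one of csv_url and csv_id must be supplied is easy to get wrong when building requests by hand. As an illustration only, a small helper could enforce it before the request is sent. The name `build_batch_payload` is hypothetical and not part of the Pumice.ai API. The sketch also assumes that run_type must be one of the two documented options, that the unused CSV field may be sent as null (as the request example at the top of this page does with csv_id), and that model_id can be omitted, none of which is stated explicitly here.

```python
import json


def build_batch_payload(run_type, tree_id, model_id=None, csv_url=None, csv_id=None):
    """Build the JSON body for the batch-categorize-products request.

    Raises ValueError unless exactly one of csv_url / csv_id is given.
    """
    if run_type not in ("static", "dynamic"):
        raise ValueError("run_type must be 'static' or 'dynamic'")
    # True when both are missing or both are present; either case is invalid.
    if (csv_url is None) == (csv_id is None):
        raise ValueError("provide exactly one of csv_url or csv_id")
    return json.dumps({
        "csv_url": csv_url,
        "csv_id": csv_id,
        "run_type": run_type,
        "tree_id": tree_id,
        "model_id": model_id,
    })
```

The returned string can be passed as the `data` argument of the POST request, alongside the required headers.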
Code Examples
Python functions to help you process the JSON output of a batch products run.
Pumice.ai formatted JSON to pandas function:
import json
import pandas as pd

def pumice_json_to_pandas(filename):
    # Load the results file and build a DataFrame from its 'data' list.
    with open(filename) as f:
        data = json.load(f)['data']
    return pd.DataFrame(data)
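Pumice.ai formatted JSON to a CSV file:
import csv
import json

def pumice_json_to_csv(filename, csv_file):
    # 'data' is expected to be a list with one dict per product.
    with open(filename) as f:
        rows = json.load(f)['data']
    # newline='' avoids blank lines between rows on Windows.
    with open(csv_file, 'w', newline='') as csvfile:
        # Use the first row's keys as the CSV header.
        writer = csv.DictWriter(csvfile, fieldnames=rows[0].keys())
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
    return 'Success!'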
Pumice.ai pandas DataFrame to CSV:
def pumice_pandas_to_csv(pumice_df, csv_file):
    pumice_df.to_csv(csv_file)
    return 'Success!'