Reference - Data Upload (SFTP)

For information that is not time-sensitive, we encourage you to upload file dumps to your Froomle SFTP server, each containing many separate entries to be processed at once. After reading this reference document, you should have a clear understanding of both how to communicate with the Froomle SFTP server and what types of data are a good fit for file dumps.

SFTP Location & Authentication

If a Data Upload integration is chosen, Froomle will set up a dedicated SFTP location for you.

Authentication

You authenticate to the SFTP server using SSH, as shown in the diagram below.

Figure 1. SFTP Communication Diagram

Froomle requires a public/private SSH2 RSA key pair that is generated for communication with Froomle only. This key pair should not have a passphrase.

Modern key-generation tools usually produce an SSH2 key by default when asked for an RSA key. If in doubt, consult the documentation of the key-generation tool you used.

The private key lives on the server used to export your files and should be stored securely. The public key should be emailed to support@froomle.com. Froomle will then take the necessary actions so that you can access the SFTP server with the feeder account, using your private key. If multiple servers will export files to the Froomle SFTP server, we advise generating a separate key pair per server.
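As an illustration of the key requirements above, the following sketch assembles an OpenSSH ssh-keygen command for an RSA key pair with an empty passphrase. The 4096-bit key size, the key file name and the comment string are assumptions made for the example, not Froomle requirements.

```python
import shlex
from pathlib import Path


def keygen_command(key_path, comment="froomle-sftp"):
    """Build an ssh-keygen command for an RSA key pair without a passphrase."""
    return [
        "ssh-keygen",
        "-t", "rsa",          # RSA key type
        "-b", "4096",         # key size in bits (assumption, not specified here)
        "-N", "",             # empty passphrase, as asked for above
        "-C", comment,        # free-form comment stored with the public key
        "-f", str(key_path),  # private key file; the public key gets a .pub suffix
    ]


cmd = keygen_command(Path("froomle_export_key"))
print(shlex.join(cmd))
# To actually create the key pair: subprocess.run(cmd, check=True)
```

The resulting froomle_export_key.pub file is the one to send to Froomle; the private key file stays on the exporting server.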

The Froomle SFTP server only accepts incoming connections from whitelisted IP ranges. Send the IP ranges you will connect from by email to support@froomle.com. We only accept connections on port 2222. Note that this is different from the standard port 22 typically used for SFTP.
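A minimal sketch of a non-interactive connection with the OpenSSH sftp client, using the non-standard port and the private key. The host name sftp.example.com, the user name feeder and the file names are placeholders; use the values Froomle provides for your setup.

```python
import shlex


def sftp_command(host, user, key_path, batch_file):
    """Build an OpenSSH sftp command that runs the commands listed in batch_file."""
    return [
        "sftp",
        "-P", "2222",      # Froomle accepts connections on port 2222, not 22
        "-i", key_path,    # private key whose public half was sent to Froomle
        "-b", batch_file,  # text file with sftp commands, e.g. "put file.csv periodic/"
        f"{user}@{host}",
    ]


# Host, user and paths below are hypothetical placeholders.
cmd = sftp_command("sftp.example.com", "feeder", "froomle_export_key", "upload.batch")
print(shlex.join(cmd))
# To run it: subprocess.run(cmd, check=True)
```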

Target directories

Our SFTP server has four different directories. Each serves a different purpose:

  • periodic: Reserved for files that are uploaded periodically (e.g. once every day). Examples of such files are catalog updates and purchase dumps from physical stores.

  • one_time: Reserved for files that are uploaded only once. This directory is most often used during the project setup to get an initial catalog dump or a historic dump of events.

  • requests: Reserved for batch recommendation request files. For a batch integration, this is where the batch request files have to be uploaded. The request filename has to contain the date of the campaign to allow for easier tracking and analysis.

  • responses: Reserved for batch recommendation response files. This is the only directory that is reserved for Froomle to write to. For a batch integration, this is where our generated response files will be stored. The filename of the response will match the one provided in the requests directory.

These directories are standard and cannot be removed or renamed. Creating additional directories is not possible.
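The directory layout above can be captured in a few lines of code so that upload scripts never write to the wrong place. This is only a sketch: it assumes the four directories sit directly under the SFTP root, and the request filename used in the example is hypothetical.

```python
from enum import Enum
from pathlib import PurePosixPath


class UploadDir(Enum):
    """Directories you may upload to."""
    PERIODIC = "periodic"
    ONE_TIME = "one_time"
    REQUESTS = "requests"


RESPONSES_DIR = "responses"  # written by Froomle only


def remote_path(directory, filename):
    """Remote path for an upload, relative to the SFTP root (assumption)."""
    return str(PurePosixPath(directory.value) / filename)


def expected_response_path(request_filename):
    """The response file carries the same filename as the request file."""
    return str(PurePosixPath(RESPONSES_DIR) / request_filename)


print(remote_path(UploadDir.PERIODIC, "prod_catalog_2020-01-01.csv"))
print(expected_response_path("campaign_2020-01-01.json"))  # hypothetical filename
```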

Files you upload to the SFTP server are removed automatically by Froomle once they have been transferred to our internal infrastructure. Files uploaded by Froomle (e.g. in the responses directory) can be removed after you consume them. If they are not removed, we will remove them automatically after 7 days.

Data delivery

How does Froomle identify my files?

The Froomle platform uses substring matching on the filename to identify what to do with a file. We therefore encourage you to use semantically meaningful names for files of the form <{environment-name}_{type-of-information-to-be-processed}_YYYY-MM-DD.ext>.

An example we will use here is prod_catalog_2020-01-01.csv, which contains a production retail catalog, with an entry for each item the webshop currently has on offer.

When there is only one environment, the environment-name can be omitted. Examples of setups with multiple environments are a separate QA and production environment, or multiple brands that share user management.
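The naming pattern described above can be produced with a small helper. The function and parameter names are illustrative only.

```python
from datetime import date


def dump_filename(kind, day, ext, environment=None):
    """Build <environment>_<kind>_YYYY-MM-DD.<ext>; the environment is optional."""
    parts = [environment] if environment else []
    parts += [kind, day.isoformat()]  # date.isoformat() yields YYYY-MM-DD
    return "_".join(parts) + "." + ext


print(dump_filename("catalog", date(2020, 1, 1), "csv", environment="prod"))
# prod_catalog_2020-01-01.csv
print(dump_filename("catalog", date(2020, 1, 1), "csv"))
# catalog_2020-01-01.csv
```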

How often should I dump a file?

The frequency of file dumps should be decided in collaboration with the Froomle support team and depends on how quickly your data evolves. For a typical product catalog, a daily dump of new products will often suffice. For flash sales, it is important that every new sale is communicated before it goes live.

What types of data does the Froomle SFTP accept?

Currently we accept JSON and CSV files. If you wish to use a different file type, please contact the Froomle support team to see if this can be supported.

How should my data be structured?

Froomle expects each row in a CSV file, or each object in a JSON file, to correspond to one metadata object or event. Objects or events cannot be nested.
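A quick pre-upload check for nesting in JSON records could look like the sketch below. It only flags fields whose value is itself an object; whether lists of plain values are acceptable is not stated here, so they are left alone.

```python
def nested_fields(record):
    """Return the names of fields whose value is a nested object (a dict)."""
    return [key for key, value in record.items() if isinstance(value, dict)]


flat = {"item_id": 1, "name": "J&M T-Shirt Blue", "price": 34.90}
nested = {"item_id": 1, "price": {"amount": 34.90, "currency": "EUR"}}

print(nested_fields(flat))    # []
print(nested_fields(nested))  # ['price']
```

A nested field such as the hypothetical price object above would have to be flattened into separate top-level fields before uploading.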

What should a file dump contain?

Froomle suggests uploading only objects or events that have been added or changed since the last file dump. This keeps file sizes small and avoids unnecessary delays from reprocessing information that was already handled in previous dumps. Froomle can, however, process full file dumps when required. Contact us at support@froomle.com if you wish to send full file dumps.
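One way to derive such an incremental dump is to compare the current catalog with the one sent previously. The sketch below assumes both are available as lists of flat records keyed by item_id; it does not cover items that were removed, since how to signal removals is not described here.

```python
def changed_items(previous, current):
    """Return the records in current that are new or differ from previous."""
    prev_by_id = {item["item_id"]: item for item in previous}
    return [item for item in current if prev_by_id.get(item["item_id"]) != item]


previous = [
    {"item_id": 1, "name": "J&M T-Shirt Blue", "stock": 23},
    {"item_id": 2, "name": "J&M Jeans Shorts", "stock": 86},
]
current = [
    {"item_id": 1, "name": "J&M T-Shirt Blue", "stock": 21},    # stock changed
    {"item_id": 2, "name": "J&M Jeans Shorts", "stock": 86},    # unchanged
    {"item_id": 3, "name": "J&M Slim Fit Jeans", "stock": 12},  # new item
]

print([item["item_id"] for item in changed_items(previous, current)])  # [1, 3]
```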

Maximum file size

The maximum supported file size is 1 GiB. If more data needs to be uploaded at once, it has to be split into separate files.
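Splitting a large CSV dump can be sketched as follows: rows are grouped into chunks that each repeat the header and stay under a byte limit when encoded as UTF-8. Repeating the header in every part and building each part in memory are simplifications chosen for this example; a single row larger than the limit would still be emitted on its own.

```python
import csv
import io

MAX_BYTES = 1024 ** 3  # 1 GiB


def _csv_line(row):
    """Serialize one row with the csv module so quoting is handled correctly."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    return buf.getvalue()


def split_csv_rows(header, rows, max_bytes=MAX_BYTES):
    """Yield CSV texts, each starting with the header and at most max_bytes long."""
    header_text = _csv_line(header)
    header_size = len(header_text.encode("utf-8"))
    chunk, size = [header_text], header_size
    for row in rows:
        text = _csv_line(row)
        row_size = len(text.encode("utf-8"))
        # Start a new part when this row would exceed the limit.
        if size + row_size > max_bytes and len(chunk) > 1:
            yield "".join(chunk)
            chunk, size = [header_text], header_size
        chunk.append(text)
        size += row_size
    if len(chunk) > 1:
        yield "".join(chunk)


# Tiny limit for demonstration only.
parts = list(split_csv_rows(["item_id", "name"], [[1, "a"], [2, "b"], [3, "c"]], max_bytes=24))
print(len(parts))  # 2
```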

Data retention policy

By default, Froomle prefers to hold onto data for as long as possible, in compliance with GDPR. If you wish to take a more pre-emptive approach, please contact us to discuss the optimal policy.

Examples

Metadata

Items

Froomle identifies items by a single-column unique identifier. All events should refer to an item by using this same unique identifier.
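Two checks follow from this: identifiers must be unique within the catalog, and events must reference identifiers that exist. The sketch below assumes the identifier column is called item_id and that events carry the referenced identifier in a field called action_item; adapt the names to your own data.

```python
from collections import Counter


def duplicate_item_ids(items):
    """Return identifiers that occur more than once in the catalog."""
    counts = Counter(item["item_id"] for item in items)
    return sorted(item_id for item_id, n in counts.items() if n > 1)


def unknown_item_refs(events, items):
    """Return events whose action_item is not a known item identifier."""
    known = {item["item_id"] for item in items}
    return [event for event in events if event["action_item"] not in known]


items = [{"item_id": 1}, {"item_id": 2}, {"item_id": 2}]
events = [{"action_item": 1}, {"action_item": 9}]

print(duplicate_item_ids(items))         # [2]
print(unknown_item_refs(events, items))  # [{'action_item': 9}]
```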

Often you will have different types of items that have different metadata: e.g. videos may have a structure that is different from news articles. Froomle has a notion of item types so that we can deal with these peculiarities. When in doubt whether you have different item types or not, contact us.

Example

An example of a catalog file dump (e.g. prod_catalog_2020-01-01.csv) is included below in both CSV and JSON format:

item_id,name,size,stock,price
1,"J&M T-Shirt Blue",M,23,34.90
2,"J&M Jeans Shorts",28/30,86,49.90
3,"J&M Slim Fit Jeans",32/34,12,79.90
4,"MMH Air-Performance Sneakers",EU42,45,75.00
5,"LoRemI Sweat Pants Slim Fit",36,143,24.90
6,"LoRemI Yoga Pants Straight Fit",34,56,29.90

[
  {
    "item_id": 1,
    "name": "J&M T-Shirt Blue",
    "size": "M",
    "stock": 23,
    "price": 34.90
  },
  {
    "item_id": 2,
    "name": "J&M Jeans Shorts",
    "size": "28/30",
    "stock": 86,
    "price": 49.90
  },
  {
    "item_id": 3,
    "name": "J&M Slim Fit Jeans",
    "size": "32/34",
    "stock": 12,
    "price": 79.90
  },
  {
    "item_id": 4,
    "name": "MMH Air-Performance Sneakers",
    "size": "EU42",
    "stock": 45,
    "price": 75.00
  },
  {
    "item_id": 5,
    "name": "LoRemI Sweat Pants Slim Fit",
    "size": "36",
    "stock": 143,
    "price": 24.90
  },
  {
    "item_id": 6,
    "name": "LoRemI Yoga Pants Straight Fit",
    "size": "34",
    "stock": 56,
    "price": 29.90
  }
]
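The two representations above carry the same records. As a sketch of how one can be derived from the other, the following converts catalog CSV text into the JSON form using only the standard library. The type conversions (integers for item_id and stock, a float for price) mirror the example and are assumptions about your own columns.

```python
import csv
import io
import json

CSV_TEXT = '''item_id,name,size,stock,price
1,"J&M T-Shirt Blue",M,23,34.90
2,"J&M Jeans Shorts",28/30,86,49.90
'''


def catalog_csv_to_json(csv_text):
    """Convert catalog CSV text into a JSON array of flat objects."""
    items = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        items.append({
            "item_id": int(row["item_id"]),
            "name": row["name"],
            "size": row["size"],            # kept as text, e.g. "28/30"
            "stock": int(row["stock"]),
            "price": float(row["price"]),   # serialized as 34.9, not 34.90
        })
    return json.dumps(items, indent=2)


print(catalog_csv_to_json(CSV_TEXT))
```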

Events

Froomle expects events to follow a pre-defined structure. For a primer on events, see: Events.

Typically, events are sent in real-time. However, certain types of offline events are also of interest to Froomle, e.g. offline purchases. Additionally, file dumps of historical events could help avoid cold-start problems at project setup.

Offline purchases

Offline purchases are commonly sent in a file dump. As with online purchases, Froomle expects the file to contain one object or row for each purchase.

Historical detail pageviews

If you have access to historical detail pageviews, these can be of great use to Froomle in avoiding cold-start problems.

Example

An example of a historic dump of events is included below in both CSV and JSON format:

event_type,device_id,user_id,action_item,channel
"detail_pageview","the_device_id-123","the_user_id-123","J&M T-Shirt Blue","www-desktop"
"detail_pageview","the_device_id-124","the_user_id-124","MMH Air-Performance Sneakers","www-desktop"
"detail_pageview","the_device_id-125","the_user_id-125","J&M Slim Fit Jeans","mobile-app"
"detail_pageview","the_device_id-126","the_user_id-126","LoRemI Yoga Pants Straight Fit","mobile-app"
"detail_pageview","the_device_id-126","the_user_id-126","MMH Air-Performance Sneakers","www-desktop"

[
  {
    "event_type" : "detail_pageview",
    "device_id" : "the_device_id-123",
    "user_id" : "the_user_id-123",
    "action_item" : "J&M T-Shirt Blue",
    "channel" : "www-desktop"
  },
  {
    "event_type" : "detail_pageview",
    "device_id" : "the_device_id-124",
    "user_id" : "the_user_id-124",
    "action_item" : "MMH Air-Performance Sneakers",
    "channel" : "www-desktop"
  },
  {
    "event_type" : "detail_pageview",
    "device_id" : "the_device_id-125",
    "user_id" : "the_user_id-125",
    "action_item" : "J&M Slim Fit Jeans",
    "channel" : "mobile-app"
  },
  {
    "event_type" : "detail_pageview",
    "device_id" : "the_device_id-126",
    "user_id" : "the_user_id-126",
    "action_item" : "LoRemI Yoga Pants Straight Fit",
    "channel" : "mobile-app"
  },
  {
    "event_type" : "detail_pageview",
    "device_id" : "the_device_id-126",
    "user_id" : "the_user_id-126",
    "action_item" : "MMH Air-Performance Sneakers",
    "channel" : "www-desktop"
  }
]
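Before uploading an events file in CSV format, it can help to verify that every row has exactly as many fields as the header, since an extra or missing column shifts all values under the wrong names. The check below is a minimal sketch using the standard csv module; the shortened identifiers in the sample are placeholders.

```python
import csv
import io


def misaligned_rows(csv_text):
    """Return (line_number, field_count) for rows whose length differs from the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    return [
        (reader.line_num, len(row))
        for row in reader
        if len(row) != len(header)
    ]


good = (
    'event_type,device_id,user_id,action_item,channel\n'
    '"detail_pageview","d-1","u-1","item-1","www-desktop"\n'
)
# The second data row carries an extra leading value and is reported.
bad = good + '2,"detail_pageview","d-2","u-2","item-2","www-desktop"\n'

print(misaligned_rows(good))  # []
print(misaligned_rows(bad))   # [(3, 6)]
```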