BDIA file format
The Bulk Data Insertion API ingests data into Adobe Analytics using batch files. These files are in a specific CSV format where each row of the file contains details of a server call. Each row, or server call, must specify an identifier for a visitor as well as a timestamp for when the interaction occurred. The server calls must be ordered chronologically by their timestamps, from earliest to latest, in the batch files. Each batch file must also be compressed.
Adobe may add optional request and response members (name/value pairs) to existing API objects at any time and without notice or changes in versioning. Adobe recommends that you refer to the API documentation of any third-party tool you integrate with our APIs so that such additions are ignored in processing if not understood. If implemented properly, such additions are non-breaking changes for your implementation. Adobe will not remove parameters or add required parameters without first providing standard notification through release notes.
Batch file requirements
Batch files must conform to all of the following requirements:
- The file format is CSV and conforms to the RFC 4180 standard, with one exception: empty lines are ignored.
- Every file consists of a header row (the first row in the file) and subsequent data rows.
- Header columns and fields are delimited by commas. If a value contains commas, surround the value in double quotes ("). If a value also contains double quotes, escape each one by doubling it inside the quoted value. For example, field1,"Value with ""quotes"", and a comma.",field3
  - The value that appears in reporting would be Value with "quotes", and a comma.
- Every row must have the same number of columns as the header row. If you want to omit a column from a row, leave the field empty or pass an empty string. For example, field1,,field3 or field1,"",field3.
- Trailing commas for header rows or data rows are not permitted.
- All rows in a batch file for any given visitor must be sorted in chronological order by timestamp from earliest to latest. Following this rule is crucial for attribution and analyzing visitor behavior. Adobe does not guarantee the integrity of data processed by this API if this rule is not strictly observed.
- All batch files must be compressed using gzip compression.
- Compressed file sizes are limited to 100 MB. Uncompressed file sizes are limited to 1 GB.
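The quoting, ordering, and compression requirements above can be sketched with Python's standard library. This is a minimal illustration, not an Adobe-provided tool: the function name build_batch is hypothetical, and the sketch assumes timestamps are integer Unix seconds, as in the examples later on this page.

```python
import csv
import gzip
import io


def build_batch(header, rows, timestamp_column="timestamp"):
    """Return gzip-compressed CSV bytes for rows given as dicts keyed by header name.

    Assumption: timestamps are integer Unix seconds.
    """
    # Sort the whole file from earliest to latest; this also keeps
    # every visitor's rows in chronological order.
    ordered = sorted(rows, key=lambda row: int(row[timestamp_column]))

    buffer = io.StringIO()
    # The default csv dialect quotes values containing commas or double
    # quotes and doubles any embedded double quotes.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in ordered:
        # Omitted columns become empty fields so that every row has the
        # same number of columns as the header row.
        writer.writerow([row.get(column, "") for column in header])

    return gzip.compress(buffer.getvalue().encode("utf-8"))
```

The returned bytes can be written to a file with any name, since file names are unrestricted.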
Batch files are flexible in the following ways:
- There are no restrictions on file names. When you submit a file to this API, Adobe returns a file_id that you can use to track the file. The name of the file is recorded under upload_name in the response object as well.
- Adobe supports both CRLF and LF line breaks to separate rows. A line break at the end of a data file is optional.
- Column header names are not case sensitive (with one exception for customerIDType, see Use customer ID to identify visitors).
- Columns can appear in any order.
- Key/value pairs in the QueryString field are also valid in any order.
Required columns
Every row must contain the following five data points. If a row is missing any one of them, that row is skipped.
- At least one of:
  - visitorID
  - marketingCloudVisitorID
  - IPAddress
  - customerID.[customerIDType].id with customerID.[customerIDType].isMCSeed set to 1. See Use customer ID to identify visitors.
- At least one of:
  - pageURL
  - pageName
  - linkType with linkName or linkURL
  - queryString that includes pageURL, pageName, or linkType as query string parameters with values
- reportSuiteID
- timestamp
- userAgent
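A pre-upload check for these five data points might look like the following sketch. The function name missing_requirements and its customer_id_type parameter are hypothetical. The sketch assumes each row is a dict keyed by column names spelled as on this page, and that the query string keys for pageURL, pageName, and linkType are g, pageName, and pe, as listed in the column reference table.

```python
from urllib.parse import parse_qs

VISITOR_COLUMNS = ("visitorID", "marketingCloudVisitorID", "IPAddress")
# Assumed queryString equivalents of pageURL, pageName, and linkType.
PAGE_QUERY_KEYS = ("g", "pageName", "pe")


def missing_requirements(row, customer_id_type=None):
    """Return the required data points that a row (dict of strings) lacks."""

    def has(column):
        return bool((row.get(column) or "").strip())

    missing = []

    # 1. At least one visitor identifier.
    has_visitor = any(has(column) for column in VISITOR_COLUMNS)
    if not has_visitor and customer_id_type:
        prefix = f"customerID.{customer_id_type}."
        has_visitor = has(prefix + "id") and row.get(prefix + "isMCSeed") == "1"
    if not has_visitor:
        missing.append("visitor identifier")

    # 2. At least one page or link value, as a column or inside queryString.
    has_page = has("pageURL") or has("pageName")
    has_page = has_page or (has("linkType") and (has("linkName") or has("linkURL")))
    if not has_page and has("queryString"):
        # parse_qs drops parameters with blank values by default.
        params = parse_qs(row["queryString"])
        has_page = any(params.get(key) for key in PAGE_QUERY_KEYS)
    if not has_page:
        missing.append("page or link")

    # 3-5. Columns that are always required.
    for column in ("reportSuiteID", "timestamp", "userAgent"):
        if not has(column):
            missing.append(column)

    return missing
```

An empty list means the row carries all five data points; anything else names what to fix before the row would be skipped.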
Adobe only uses one visitor ID for a given row. If more than one visitor ID column contains data, Adobe uses the following priority to identify that visitor:
1. customerID.[customerIDType].id with customerID.[customerIDType].isMCSeed set to 1
2. visitorID
3. marketingCloudVisitorID
4. IPAddress
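To predict which identifier applies to a row, the priority order above can be mirrored in a few lines. This is only an illustration of the documented order, not Adobe's implementation; effective_visitor_id and customer_id_type are hypothetical names, and rows are assumed to be dicts of strings.

```python
def effective_visitor_id(row, customer_id_type=None):
    """Return (column, value) for the identifier that wins, or None if none is set."""
    candidates = []
    if customer_id_type:
        prefix = f"customerID.{customer_id_type}."
        # The customer ID only counts as a visitor identifier when isMCSeed is 1.
        if row.get(prefix + "isMCSeed") == "1":
            candidates.append(prefix + "id")
    candidates += ["visitorID", "marketingCloudVisitorID", "IPAddress"]

    for column in candidates:
        value = (row.get(column) or "").strip()
        if value:
            return column, value
    return None
```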
Query string or column-based row
Adobe offers two ways to populate rows with data.
- Use column headers: Use a separate column for each variable.
- Use the queryString column: Include most data in the queryString column. This method is particularly helpful for implementations that use data generated by AppMeasurement libraries. You can include the image request's entire query string in this column with minimal adjustments. Other columns, such as timestamp and reportSuiteID, are not included in queryString and are still required as separate columns.
You can combine both of these methods in any amount to fill out rows with data. If a variable is present as both a query string and its column header, the column header value takes priority. For example, if the pageName column is "Column header example" and the queryString column contains "pageName=Query string example", the value that Adobe uses is "Column header example".
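The precedence rule can be illustrated with a short sketch that resolves a row the same way. It is not Adobe's processing code. The mapping covers only a handful of columns taken from the reference table, and resolve_variables is a hypothetical name.

```python
from urllib.parse import parse_qs

# Column header -> queryString key, for a few columns from the reference table.
COLUMN_TO_QUERY_KEY = {
    "pageName": "pageName",
    "pageURL": "g",
    "campaign": "v0",
    "channel": "ch",
    "referrer": "r",
}


def resolve_variables(row):
    """Return the value used for each mapped variable in a row (dict of strings)."""
    params = parse_qs(row.get("queryString") or "")
    resolved = {}
    for column, key in COLUMN_TO_QUERY_KEY.items():
        column_value = row.get(column) or ""
        if column_value:
            # A populated column header takes priority over the query string.
            resolved[column] = column_value
        elif key in params:
            resolved[column] = params[key][0]
    return resolved
```

With pageName set to "Column header example" and queryString set to "pageName=Query%20string%20example&v0=summer", the sketch resolves pageName from the column and campaign from the query string.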
CSV and query string column reference
Adobe supports the following columns in batch files.
Column header name | queryString equivalent | Description |
---|---|---|
aamlh | aamlh | Integer that represents the Adobe Audience Manager location hint. Valid values include: 3: Hong Kong/Singapore (apse.demdex.net); 6: Amsterdam/London (irl1.demdex.net); 7: US Central/East (use.demdex.net); 8: Australia (apse2.demdex.net); 9: US West (usw2.demdex.net); 11: Tokyo (tyo3.demdex.net). |
browserHeight | bh | The Browser height dimension. |
browserWidth | bw | The Browser width dimension. |
campaign | v0 | The Tracking code dimension. |
channel | ch | The Site section dimension. |
colorDepth | c | The Color depth dimension. |
connectionType | ct | The Connection type dimension. |
contextData.key | c.[key] | contextData implementation variables. |
cookiesEnabled | k | The Cookie support dimension. |
currencyCode | cc | The currencyCode implementation variable. |
customerID.[customerIDType].id | cid.[customerIDType].id | The id used in the Experience Cloud Identity Service setCustomerIDs method. |
customerID.[customerIDType].authState | cid.[customerIDType].as | The authState used in the Experience Cloud Identity Service setCustomerIDs method. String values are not case sensitive. Supported values are: 0, UNKNOWN, or an empty string: Not logged in; 1 or AUTHENTICATED: Logged in; 2 or LOGGED_OUT: Logged out. |
customerID.[customerIDType].isMCSeed | cid.[customerIDType].ismcseed | An integer boolean that lets you use customerID.[customerIDType].id as the hit's identifier. Use 1 for true and 0 for false. See Use customer ID to identify visitors. |
eVar1 - eVar250 | v1 - v250 | eVar dimensions. |
events | events | The events implementation variable. |
hier1 - hier5 | h1 - h5 | Hierarchy dimensions. |
hints.architecture | h.architecture | Client Hints: The underlying architecture for the device |
hints.bitness | h.bitness | Client Hints: "bitness" of the user-agent's CPU architecture — typically 64 or 32 |
hints.brands | h.brands | Client Hints: List of browser brands and their significant version, formatted as a serialized JSON object array: [{"brand":"Chromium","version":"104"}, {"brand":"Google Chrome","version":"104"}] |
hints.mobile | h.mobile | Client Hints: Boolean indicating if the browser is on a mobile device |
hints.model | h.model | Client Hints: The device model |
hints.platform | h.platform | Client Hints: The platform for the device, usually the operating system (OS) |
hints.platformversion | h.platformversion | Client Hints: The version for the platform or OS |
hints.wow64 | h.wow64 | Client Hints: Boolean indicating if a 32-bit user-agent application is running on a 64-bit Windows machine |
ipaddress | N/A (Only available with column header) | The visitor's IP address. |
javaEnabled | v | The Java enabled dimension. |
language | N/A (Only available with column header) | The Language dimension. |
linkName | pev2 | The Download link, Exit link, or Custom link dimension, depending on the value in the linkType column. If this column contains a value, pageName is ignored. |
linkType | pe | The type of link. Defaults to o if this field is empty and linkName contains a value. Valid values when using the linkType column include: d: Download link; e: Exit link; o: Custom link. When using the pe query string, use: lnk_d: Download link; lnk_e: Exit link; lnk_o: Custom link. |
linkURL | pev1 | The link URL. |
list1 - list3 | l1 - l3 | List variable dimensions. |
marketingCloudVisitorID | mid | The unique identifier used with the Adobe Experience Cloud Identity Service. |
pageName | pageName | The Page dimension. |
pageType | pageType | The pageType implementation variable. Set to the string value "errorPage" on any error pages, such as a 404 or 503 error. |
pageURL | g | The Page URL dimension. |
products | products | The products implementation variable. |
prop1 - prop75 | c1 - c75 | Prop dimensions. |
purchaseID | purchaseID | The purchaseID implementation variable. |
queryString | This column provides information for this field. | Key/value pairs that provide an alternative to using header columns. This column must be fully URL encoded, including any multi-byte characters. Adobe encodes the query string in UTF-8 by default. |
referrer | r | The referrer implementation variable. |
reportSuiteID | N/A (Only available with column header) | Specifies the report suite(s) where you want to submit data. Separate multiple report suite IDs with a comma. |
resolution | s | The Monitor resolution dimension. |
server | server | The Server dimension. |
timestamp | ts | |
tnta | tnta | Target data payload. Used with Analytics for Target integrations. |
trackingServer | N/A (Only available with column header) | The trackingServer implementation variable. |
transactionID | xact | The transactionID variable. |
userAgent | N/A (Only available with column header) | The device's user agent string. |
visitorID | vid | The visitorID implementation variable. |
zip | zip | The Zip code dimension. |
The table above lists the only column headers that Adobe supports. If you upload a file with a column header that is not included in the table, that column is ignored.
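Because the queryString column must be fully URL encoded, including multi-byte characters, one way to build its value is with Python's urllib.parse. This is a sketch under the assumption that the input dict is already keyed by queryString names from the table (for example pageName, pev1, events); build_query_string is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode


def build_query_string(variables):
    """Percent-encode key/value pairs as UTF-8 for the queryString column."""
    # quote_via=quote encodes spaces as %20 and, with urlencode's default
    # safe="", also encodes characters such as "/", ":", "?", and ",".
    return urlencode(variables, quote_via=quote, encoding="utf-8")


example = build_query_string(
    {
        "pageName": "中文网站",
        "pev1": "https://www.adobe.com/who?q=whoisit",
        "events": "prodView,event2",
    }
)
```

Pair order does not matter, since key/value pairs in the query string are valid in any order.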
Batch file examples
The following text blocks are examples of what a CSV file looks like with a small number of rows and columns. Both examples contain a header row with two rows of data.
Batch file using the querystring column
timestamp,visitorid,reportsuiteid,querystring,useragent
1492191617,44444445,examplersid,pageName=PIGINI&v2=Var21&v3=Var31&c1=val11&c2=val21&c3=val31&bh=1000&bw=999&c=1024&j=3.41&k=1&p=1&s=1111&v=1&channel=TestChannel&pev1=https%3A%2F%2Fwww.adobe.com%2Fwho%3Fq%3Dwhoisit&state=UT&zip=84005&cc=USD&events=prodView%2Cevent2,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.1 Safari/537.36"
1492191627,44444445,examplersid,pageName=PIGINI&v2=Var22&v3=Var32&c1=val12&c2=val22&c3=val32&bh=1000&bw=999&c=1024&j=3.41&k=1&p=1&s=1111&v=1&channel=TestChannel&pev1=https%3A%2F%2Fwww.adobe.com%2Fwho%3Fq%3Dwhoisit&state=UT&zip=84005&cc=USD&events=prodView%2Cevent2,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.1 Safari/537.36"
Batch file using column headers
pageName,timestamp,reportSuiteID,visitorID,userAgent,campaign,contextData.color,contextData.frame,pageURL,prop1,channel
中文网站,1495483797,examplersid,238915514,"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1 ""SpecialBuild""",Summer,Red,Titanium,http://example.com/path?param=val&param2=val2,p2,Mobile
中文网站,1495483797,examplersid,142805255,"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1 ""SpecialBuild""",Summer,Gray,Carbon,http://example.com/path?param=val&param2=val2,p2,Mobile
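Before uploading files like these, a quick structural check can catch rows whose column count differs from the header row, which is also how a stray trailing comma shows up. The sketch below is an assumption-laden helper, not part of the API: structural_problems is a hypothetical name, and it takes the gzip-compressed file contents as bytes.

```python
import csv
import gzip
import io


def structural_problems(payload):
    """Return messages for rows whose column count differs from the header row."""
    text = gzip.decompress(payload).decode("utf-8")
    # newline="" leaves CRLF and LF untouched so the csv module handles both.
    reader = csv.reader(io.StringIO(text, newline=""))

    header = None
    problems = []
    for fields in reader:
        if not fields:
            # Empty lines are ignored, matching the file format rules.
            continue
        if header is None:
            header = fields
            continue
        if len(fields) != len(header):
            problems.append(
                f"line {reader.line_num}: expected {len(header)} columns, "
                f"found {len(fields)}"
            )
    return problems
```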
Once you have a correctly formatted file, you can start sending calls to the available Endpoints.