
Product Export

Overview

The Catalog Service provides out-of-box product export facilities that can produce a flat file such as a CSV. This can be triggered through the Product Export Endpoint.

After the export is initiated, an event is produced on the processExportRequestOutput message channel and can then be consumed on the processExportRequestInput channel. ProcessExportRequestListener typically handles consuming these messages and defers processing of the requests to any configured ExportProcessors. In the case of a product export, this is the ProductExportProcessor.

ProductExportProcessor

The ProductExportProcessor handles converting a Product into a flat-file representation. It reads all of the entities to be converted and then defers the actual conversion into row maps to the ProductExportRowProducer. Each row is converted by an export row converter.

Export Row Converters

The export row converters handle converting a single entity instance into a map structure that represents a single row in a flat file. The mapping of entity fields to file headers is defined by ExportSpecifications. Thus, each entity that should be included in the flat file should have both an ExportSpecification and an export row converter (a sketch of a custom converter follows the list below).

Note

There are several out-of-the-box converters that have been implemented in order to transform a Java POJO into a Map<String, String> that can be written as a row for a file. These are all found in the com.broadleafcommerce.catalog.dataexport.converter package, and they implement Converter<[POJO type], Map<String, String>>.

The out-of-box converters, each with a corresponding ExportSpecification, are:

  • AttributeChoiceValueExportRowConverter

  • CategoryProductExportRowConverter

  • DimensionsExportRowConverter

  • IncludedProductExportRowConverter

  • ProductAssetExportRowConverter

  • ProductExportRowConverter

  • ProductOptionExportRowConverter

  • SpecificItemChoiceExportRowConverter

  • VariantExportRowConverter

  • WeightExportRowConverter
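
As an illustration, a custom row converter for a new entity might look roughly like the sketch below. The Warranty POJO, the header names, and the use of Spring's Converter interface are assumptions made for this example only; consult the out-of-box converters in com.broadleafcommerce.catalog.dataexport.converter for the exact contract and for how the produced headers must line up with the corresponding ExportSpecification.

    import java.util.LinkedHashMap;
    import java.util.Map;

    import org.springframework.core.convert.converter.Converter;

    /**
     * Illustrative sketch only: converts a hypothetical Warranty POJO into the
     * Map<String, String> structure that represents a single flat-file row.
     * The map keys are the file headers; a matching ExportSpecification would
     * declare the same headers.
     */
    public class WarrantyExportRowConverter implements Converter<Warranty, Map<String, String>> {

        @Override
        public Map<String, String> convert(Warranty source) {
            Map<String, String> row = new LinkedHashMap<>();
            row.put("warrantyProvider", source.getProvider());
            row.put("warrantyDurationMonths", String.valueOf(source.getDurationMonths()));
            return row;
        }
    }

    /** Hypothetical POJO used only for this example. */
    class Warranty {
        private final String provider;
        private final int durationMonths;

        Warranty(String provider, int durationMonths) {
            this.provider = provider;
            this.durationMonths = durationMonths;
        }

        public String getProvider() { return provider; }

        public int getDurationMonths() { return durationMonths; }
    }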

Implementation Details

Special Cases

While most fields follow a single-value-per-cell format, certain "embedded collection" fields are represented as a multi-valued cell. Generally speaking, a multi-valued cell contains a single string in which special delimiters mark the boundaries between collection elements.

Attributes

Product attributes are an example of a special-case field that is exported into a multi-valued cell. Out of the box, the ProductExportRowConverter exports the entire Map<String, Attribute> into a single cell. Within each entry, the key and value are separated by a :, and entries are separated by a |. For example, a simple output could be: attrKey1:attrVal1|attrKey2:attrVal2.

If the value inside the Attribute is a Collection or an Array, the elements are joined together with , as the delimiter. For example, a potential output could be: singularAttr:singularVal|collectionAttr:collectionElem1,collectionElem2,collectionElem3|arrayAttr:arrayElem1,arrayElem2,arrayElem3.
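
To make the delimiter scheme concrete, the following is a minimal, self-contained sketch that reproduces the documented format (: between key and value, | between entries, , between collection elements). It does not use the actual Broadleaf converter classes and is only meant to illustrate the shape of the output cell.

    import java.util.Collection;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    public class AttributeCellExample {

        public static void main(String[] args) {
            // Attribute values may be single values or collections (arrays would be handled analogously).
            Map<String, Object> attributes = new LinkedHashMap<>();
            attributes.put("singularAttr", "singularVal");
            attributes.put("collectionAttr", List.of("collectionElem1", "collectionElem2", "collectionElem3"));

            // Key and value are separated by ":", entries are separated by "|".
            String cell = attributes.entrySet().stream()
                    .map(entry -> entry.getKey() + ":" + toCellValue(entry.getValue()))
                    .collect(Collectors.joining("|"));

            // Prints: singularAttr:singularVal|collectionAttr:collectionElem1,collectionElem2,collectionElem3
            System.out.println(cell);
        }

        // Collection elements are joined with "," as the delimiter; single values are written as-is.
        private static String toCellValue(Object value) {
            if (value instanceof Collection) {
                return ((Collection<?>) value).stream()
                        .map(String::valueOf)
                        .collect(Collectors.joining(","));
            }
            return String.valueOf(value);
        }
    }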

Note

Out of the box, only these two "multi-element" types, Collection and Array, are supported. Customizing this behavior is straightforward: extend ProductExportRowConverter and override the getMultiValRepresentationOfAttributes() method to change how the transformation occurs.

Tip
If you plan to import files that you export, the import logic should also be adjusted to support parsing your custom syntax/format.

Performance

We have done some performance testing on the product export process.

  • These tests have been performed on a 2017 15" MacBook Pro with 16GB RAM and a 2.8GHz Intel Core i7 with 4 cores.

  • Each product in the trials had 20 variants, 4 category-product relationships, and 10 product assets (34 total dependents per product).

  • The batch size used in all trials was 100, meaning 100 products would be processed at a time.

Individual export

Summary

For a 500-product export run via Docker, the average time across 5 export attempts was 30.8 seconds. When run directly from the command line via java -jar, the average time across 5 export attempts was 39.2 seconds.

Detailed Results

Varying numbers of products
Note
These trials were run with the debugger active.
Table 1. Trials with Varying Product Counts (run with debugger)
Name | Products | Variants per Product | Category Products per Product | ProductAssets per Product | Time Taken | Heap Usage (ignore before dashed vertical line) | Note
Export with 20 Products | 20 | 20 | 4 | 10 | 3 seconds | ExportOf20ProductsHeapUsage | Data insert of the items occurred at the beginning of this run.
Export with 100 Products | 100 | 20 | 4 | 10 | 13 seconds | ExportOf100ProductsHeapUsage | Data insert of the items occurred at the beginning of this run.
Export with 500 Products Attempt 1 | 500 | 20 | 4 | 10 | 84 seconds | ExportOf500ProductsAttempt1HeapUsage | Data insert of the items occurred at the beginning of this run.
Export with 500 Products Attempt 2 | 500 | 20 | 4 | 10 | 81 seconds | ExportOf500ProductsAttempt2HeapUsage | Uses the data already inserted in the first 500 product run.
Export with 500 Products Attempt 3 (ran outside of debugger) | 500 | 20 | 4 | 10 | 65 seconds | ExportOf500ProductsAttempt3HeapUsage | Uses the data already inserted in the first 500 product run. The dashed line in this graph indicates when the export finished, not when it started.

500 product export
Note
These trials were run without any debugger active, and were all executed on the exact same data set.
Table 2. Individual 500 Product Exports (run without debugger)
Name | Products | Variants per Product | Category Products per Product | ProductAssets per Product | Time Taken | Note
Export of 500 Products, Run in Docker, Attempt 1 | 500 | 20 | 4 | 10 | 47 seconds | Attempts 1-5 of "run in docker" were all run back to back.
Export of 500 Products, Run in Docker, Attempt 2 | 500 | 20 | 4 | 10 | 30 seconds | Attempts 1-5 of "run in docker" were all run back to back.
Export of 500 Products, Run in Docker, Attempt 3 | 500 | 20 | 4 | 10 | 25 seconds | Attempts 1-5 of "run in docker" were all run back to back.
Export of 500 Products, Run in Docker, Attempt 4 | 500 | 20 | 4 | 10 | 27 seconds | Attempts 1-5 of "run in docker" were all run back to back.
Export of 500 Products, Run in Docker, Attempt 5 | 500 | 20 | 4 | 10 | 25 seconds | Attempts 1-5 of "run in docker" were all run back to back.
Export of 500 Products, Run with "java -jar" from the command line, Attempt 1 | 500 | 20 | 4 | 10 | 45 seconds | Attempts 1-5 of "run with java -jar" were all run back to back.
Export of 500 Products, Run with "java -jar" from the command line, Attempt 2 | 500 | 20 | 4 | 10 | 38 seconds | Attempts 1-5 of "run with java -jar" were all run back to back.
Export of 500 Products, Run with "java -jar" from the command line, Attempt 3 | 500 | 20 | 4 | 10 | 41 seconds | Attempts 1-5 of "run with java -jar" were all run back to back.
Export of 500 Products, Run with "java -jar" from the command line, Attempt 4 | 500 | 20 | 4 | 10 | 36 seconds | Attempts 1-5 of "run with java -jar" were all run back to back.
Export of 500 Products, Run with "java -jar" from the command line, Attempt 5 | 500 | 20 | 4 | 10 | 36 seconds | Attempts 1-5 of "run with java -jar" were all run back to back.

Concurrent exports

We ran tests in which 10 exports of 500 products each were initiated at the same time. The variable in these trials was the spring.cloud.stream.bindings.process-export-request-input.consumer.concurrency property, which defines how many exports are processed simultaneously.
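
For reference, a consumer concurrency of 5 (the best-performing setting in the trials below) could be configured in application.yml roughly as follows. The property name is taken directly from the trials; verify the binding name against your own Spring Cloud Stream configuration.

    spring:
      cloud:
        stream:
          bindings:
            process-export-request-input:
              consumer:
                # number of export request messages processed simultaneously
                concurrency: 5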

Note
These trials were run without any debugger active, and were all executed on the exact same data set.
Note
The MaxHeapSize limit was set to -Xmx1536m for all trials.

Summary

The optimal concurrency setting appears to be 5.

With a concurrency of 5, all 10 exports completed in only 72% of the time taken with a concurrency of 2 when run in Docker (133 seconds vs. 184 seconds), at the cost of 25% more peak memory usage. When run directly from the command line, they completed in only 62% of the time taken with a concurrency of 2 (155 seconds vs. 251 seconds), at the cost of 27% more peak memory usage.

Increasing the concurrency to 10 had diminishing returns: all 10 exports completed in 98% (in Docker, 131 seconds vs. 133 seconds) and 90% (from the command line, 140 seconds vs. 155 seconds) of the time taken with a concurrency of 5. This was at the cost of 20% and 9% more memory usage, respectively.

Detailed Results

Table 3. Concurrent 500 Product Exports (run without debugger)
Trial Name | Total Time Taken | GCEasy Report | JVM Memory Size | Heap Usage | Products per Export | Variants per Product | Category Products per Product | ProductAssets per Product
10 Exports Run in Docker, Concurrency of 2 | 184 seconds | Link | 10ExportsRunInDockerConcurrency2JVMMem | 10ExportsRunInDockerConcurrency2HeapUsage | 500 | 20 | 4 | 10
10 Exports Run in Docker, Concurrency of 5 | 133 seconds | Link | 10ExportsRunInDockerConcurrency5JVMMem | 10ExportsRunInDockerConcurrency5HeapUsage | 500 | 20 | 4 | 10
10 Exports Run in Docker, Concurrency of 10 | 131 seconds | Link | 10ExportsRunInDockerConcurrency10JVMMem | 10ExportsRunInDockerConcurrency10HeapUsage | 500 | 20 | 4 | 10
10 Exports Run with "java -jar" from the command line, Concurrency of 2 | 251 seconds | Link | 10ExportsRunWithJavaJarConcurrency2JVMMem | 10ExportsRunWithJavaJarConcurrency2HeapUsage | 500 | 20 | 4 | 10
10 Exports Run with "java -jar" from the command line, Concurrency of 5 | 155 seconds | Link | 10ExportsRunWithJavaJarConcurrency5JVMMem | 10ExportsRunWithJavaJarConcurrency5HeapUsage | 500 | 20 | 4 | 10
10 Exports Run with "java -jar" from the command line, Concurrency of 10 | 140 seconds | Link | 10ExportsRunWithJavaJarConcurrency10JVMMem | 10ExportsRunWithJavaJarConcurrency10HeapUsage | 500 | 20 | 4 | 10