This interface exposes the methods required to interact with an asset storage provider: storing, retrieving, and deleting digital content. In other words, this service supports DefaultAssetStorageType#INTERNAL assets, which Broadleaf has access to and is responsible for storing. It should be implemented once for each type of storage provider to be used.
By default, there should only be a single implementation of this component in use at a time, meaning that there should only be one active storage provider used by the Asset service.
Note that use of this service can be restricted by a MIME type whitelist (see isWhitelistedMimeType(String)), which lets implementors limit the kinds of files that are supported. This method is linked to a system property of the form broadleaf.asset.internal.storageProvider.mimeTypeWhitelist.
The method contract supports creating new resources from either source InputStream or File instances.
Also note that, in the case of failures while processing multiple resources in either the addResourcesFromStreams(Map) or deleteResources(Iterable) methods, the default implementation will try to process as many resources as possible before returning a composite exception with all files that failed and their causes (see BulkStorageException).
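The "process as many as possible, then report all failures" behavior described above can be sketched as follows. This is a minimal illustration only, not the default implementation: SketchBulkFailure is a hypothetical stand-in for BulkStorageException (whose actual constructor and accessors are not shown here), and deleteOne is an assumed callback that represents deleting a single resource.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical composite failure; NOT Broadleaf's BulkStorageException. */
final class SketchBulkFailure extends RuntimeException {
    private final Map<String, Exception> failures;

    SketchBulkFailure(Map<String, Exception> failures) {
        super(failures.size() + " resource(s) failed: " + failures.keySet());
        this.failures = failures;
    }

    /** Failed paths mapped to the cause of each failure. */
    Map<String, Exception> getFailures() {
        return failures;
    }
}

final class BulkDeleteSketch {
    /** Attempts every path, collecting failures instead of stopping at the first one. */
    static void deleteAll(Iterable<String> paths, Consumer<String> deleteOne) {
        Map<String, Exception> failures = new LinkedHashMap<>();
        for (String path : paths) {
            try {
                deleteOne.accept(path);
            } catch (Exception e) {
                // Remember the failure and keep going with the remaining paths.
                failures.put(path, e);
            }
        }
        if (!failures.isEmpty()) {
            throw new SketchBulkFailure(failures);
        }
    }
}
```

A caller would catch the composite exception and inspect getFailures() to decide which resources to retry or report.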
StorageProvider Paths
Overview
Notably, the 'URLs' or 'paths' provided to StorageProvider are expected to already be globally unique.
For example, they will look something like /tenants/{tenant-id}/my-asset.jpg, or /applications/{application-id}/my-asset.jpg.
The StorageProvider component itself is not intended to carry the burden of DataTracking context discrimination or other complex processes.
It is merely intended to be a data storage mechanism, and as such, it is the responsibility of the calling component to ensure the path/URL values are unique before submitting them to the StorageProvider.
For example, when the StorageService is processing assets, it does the following:
- For a newly uploaded Asset, if the Asset.url would be a match for an existing asset in the same context, it will apply an incrementing suffix to the end of the URL (ex: /my-asset.jpg becomes /my-asset-1.jpg). This ensures that at least within a particular tenant/application, the Asset.url itself is unique.
- Before calling the StorageProvider, Asset.url is prepended with the context-aware prefix (ex: /tenants/{tenant-id} for tenant-level assets, or /applications/{application-id} for application-level assets) based on the Asset entity’s context. This allows different tenants/applications to have similarly named assets without clashing.
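The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the StorageService code: AssetPathSketch and its methods are hypothetical names, the set of existing URLs stands in for whatever lookup the service performs within a context, and the suffix is assumed to be placed before the file extension as in the /my-asset-1.jpg example.

```java
import java.util.Set;

final class AssetPathSketch {
    /** Appends -1, -2, ... before the extension until the URL is unused in this context. */
    static String uniqueUrl(String url, Set<String> existingUrls) {
        if (!existingUrls.contains(url)) {
            return url;
        }
        int slash = url.lastIndexOf('/');
        int dot = url.lastIndexOf('.');
        // Only treat the dot as an extension separator if it is inside the file name.
        boolean hasExtension = dot > slash + 1;
        String base = hasExtension ? url.substring(0, dot) : url;
        String extension = hasExtension ? url.substring(dot) : "";
        int suffix = 1;
        String candidate;
        do {
            candidate = base + "-" + suffix++ + extension;
        } while (existingUrls.contains(candidate));
        return candidate;
    }

    /** Prefix for tenant-level assets. */
    static String tenantPath(String tenantId, String url) {
        return "/tenants/" + tenantId + url;
    }

    /** Prefix for application-level assets. */
    static String applicationPath(String applicationId, String url) {
        return "/applications/" + applicationId + url;
    }
}
```

The value passed to the StorageProvider would then be the prefixed result, for example tenantPath(tenantId, uniqueUrl(url, existingUrls)).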
StorageLocationMapping
Internally, a StorageProvider implementation may not necessarily store assets exactly at the path/URL provided by the caller.
Broadleaf has a StorageLocationMapping entity that can optionally be used by such StorageProvider implementations to keep track of the mappings from the 'original location' (path/URL) provided by the caller to the actual 'storage provider location' that holds the binary data.
This has the additional advantage of making potential migrations of data easier to tackle, should it be necessary.
This entity is only expected to be managed/used internally by StorageProvider implementations that need it, and is not intended to be exposed to outside callers.
It is mainly an implementation detail.
Please see the StorageLocationMappingService/StorageLocationMappingRepository components for performing operations on this entity.
As of 2.0.3, both the FilesystemStorageProvider and the GoogleCloudStorageProvider have been updated to create StorageLocationMapping records for any newly created assets.
During asset resolution/management, the actual storage location is determined first by checking for a StorageLocationMapping record.
If no mapping record is available, the system will fall back to the legacy (pre-2.0.3) storage location calculation.
This guarantees backward compatibility and successful resolution of previously-created assets.
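The mapping-first resolution with a legacy fallback can be sketched as follows. This is a simplified illustration, not the providers' code: StorageLocationResolverSketch is a hypothetical class, the in-memory map stands in for a lookup through StorageLocationMappingService or StorageLocationMappingRepository, and legacyCalculation stands in for the pre-2.0.3 location calculation.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

final class StorageLocationResolverSketch {
    // original location (caller-provided path/URL) -> actual storage provider location
    private final Map<String, String> mappings;
    // assumed stand-in for the legacy (pre-2.0.3) location calculation
    private final Function<String, String> legacyCalculation;

    StorageLocationResolverSketch(Map<String, String> mappings,
                                  Function<String, String> legacyCalculation) {
        this.mappings = mappings;
        this.legacyCalculation = legacyCalculation;
    }

    /** Prefers a mapping record; falls back to the legacy calculation when none exists. */
    String resolve(String originalLocation) {
        return Optional.ofNullable(mappings.get(originalLocation))
                .orElseGet(() -> legacyCalculation.apply(originalLocation));
    }
}
```

Because the legacy calculation is only consulted when no record exists, assets created before mapping records were introduced continue to resolve without any data migration.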
Implementations
There are a few implementations of StorageProvider that come with Broadleaf by default.
FilesystemStorageProvider
This is the default StorageProvider implementation.
It is active when broadleaf.asset.internal.storage-provider.implementation=FILESYSTEM, and can be configured further with properties under broadleaf.asset.internal.storage-provider.filesystem.*, as noted in InternalAssetProperties.
With this approach, the system will access and manage assets as files under a directory path.
In a production environment, the typical pattern is to establish a shared persistent volume, and use NFS volume mounts to allow each individual container to access the data at a locally-available path.
Storage Path Calculation
In order to achieve a more efficient and balanced distribution of files in the filesystem, the FilesystemStorageProvider will internally hash the caller-provided URL/path and use the result to calculate the actual storage path.
This evenly distributes the files and avoids having a single directory with too many files.
- Behavior since 2.0.3
  - If the URL is /product/myproductimage.jpg, then the MD5 would be 35ec52a8dbd8cf3e2c650495001fe55f. Assuming broadleaf.asset.internal.storage-provider.filesystem.max-generated-directory-depth=2, this would result in the following file on the filesystem: {providerRootLocation}/35/ec/{RANDOM-ULID}.jpg
  - In this implementation, the filename is set to a fully random ULID value (albeit with the same extension as the original URL). This ensures that a collision, where multiple distinct caller-provided URLs map to the same actual location, is not possible.
  - This requires the StorageLocationMapping concept to map between original/actual paths.
- Behavior prior to 2.0.3
  - If the URL is /product/myproductimage.jpg, then the MD5 would be 35ec52a8dbd8cf3e2c650495001fe55f. Assuming broadleaf.asset.internal.storage-provider.filesystem.max-generated-directory-depth=2, this would result in the following file on the filesystem: {providerRootLocation}/35/ec/myproductimage.jpg.
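The hash-based directory layout described above can be sketched as follows. This is a minimal sketch, not the FilesystemStorageProvider code: HashedPathSketch is a hypothetical class, and it assumes the entire caller-provided URL is hashed as UTF-8 bytes and that each directory level uses two hex characters of the digest, as the 35/ec example suggests. The file name is passed in so the same method covers both the legacy layout (original file name) and the newer layout (generated name with the original extension).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashedPathSketch {
    /** Returns the lowercase hex MD5 digest of the given value (UTF-8 assumed). */
    static String md5Hex(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest is unavailable", e);
        }
    }

    /** Builds "aa/bb/.../fileName" from the first maxDepth two-character slices of the hash. */
    static String storagePath(String url, int maxDepth, String fileName) {
        String hash = md5Hex(url);
        StringBuilder path = new StringBuilder();
        for (int i = 0; i < maxDepth && (i + 1) * 2 <= hash.length(); i++) {
            path.append(hash, i * 2, i * 2 + 2).append('/');
        }
        return path.append(fileName).toString();
    }
}
```

Under these assumptions, the result is relative to {providerRootLocation}, and a larger max-generated-directory-depth simply consumes more two-character slices of the hash.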
GoogleCloudStorageProvider
It is active when broadleaf.asset.internal.storage-provider.implementation=GCS, and can be configured further with properties under broadleaf.asset.internal.storage-provider.google-cloud-storage.*, as noted in InternalAssetProperties.
This also requires the com.google.cloud:google-cloud-storage dependency to be available.
With this approach, the system will access and manage assets as objects in a GCS bucket.
Storage Path Calculation
- Behavior since 2.0.3
  - If the URL is /product/myproductimage.jpg, and broadleaf.asset.internal.storage-provider.google-cloud-storage.path-prefix-in-bucket=blc_assets, then this would result in the following object name under the configured bucket: blc_assets/{RANDOM-ALPHANUM-PREFIX}-{RANDOM-ULID}.jpg
  - In this implementation, the filename is set to a fully random value (albeit with the same extension as the original URL). This ensures that a collision, where multiple distinct caller-provided URLs map to the same actual location, is not possible. The filename is a ULID prepended with random alphanumeric characters to honor Google Cloud Storage object naming best practices, particularly those around using non-sequential names.
  - This requires the StorageLocationMapping concept to map between original/actual paths.
- Behavior prior to 2.0.3
  - If the URL is /product/myproductimage.jpg, then the MD5 would be 35ec52a8dbd8cf3e2c650495001fe55f. Assuming broadleaf.asset.internal.storage-provider.google-cloud-storage.max-generated-directory-depth=2 and broadleaf.asset.internal.storage-provider.google-cloud-storage.path-prefix-in-bucket=blc_assets, this would result in the following object name under the configured bucket: blc_assets/35/ec/myproductimage.jpg.
  - In this implementation, the filename from the original URL is preserved in the final object name.
  - This implementation was originally just a carry-over that matched the approach from FilesystemStorageProvider for the sake of consistency.
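For comparison with the legacy layout, the since-2.0.3 object naming ({path-prefix}/{RANDOM-ALPHANUM-PREFIX}-{RANDOM-ULID}.{extension}) can be sketched as follows. This is an illustration only, not the GoogleCloudStorageProvider code: GcsObjectNameSketch is a hypothetical class, the length and alphabet of the random prefix are assumptions, and the unique identifier is passed in as a plain string because the JDK has no built-in ULID generator.

```java
import java.util.Random;

final class GcsObjectNameSketch {
    // Assumed alphabet for the random prefix; the real provider's choice is not documented here.
    private static final String ALPHANUM = "abcdefghijklmnopqrstuvwxyz0123456789";

    /** Returns the extension of the file name including the dot, or "" if there is none. */
    static String extensionOf(String url) {
        int slash = url.lastIndexOf('/');
        int dot = url.lastIndexOf('.');
        return dot > slash + 1 ? url.substring(dot) : "";
    }

    /** Builds "{pathPrefix}/{randomPrefix}-{uniqueId}{extension}". */
    static String objectName(String pathPrefix, String url, String uniqueId,
                             int randomPrefixLength, Random random) {
        StringBuilder randomPrefix = new StringBuilder(randomPrefixLength);
        for (int i = 0; i < randomPrefixLength; i++) {
            randomPrefix.append(ALPHANUM.charAt(random.nextInt(ALPHANUM.length())));
        }
        return pathPrefix + "/" + randomPrefix + "-" + uniqueId + extensionOf(url);
    }
}
```

Leading with random characters keeps consecutive object names non-sequential even though ULIDs created close together share a time-based prefix.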