This interface exposes the methods required for interacting with an asset storage provider, that is, for storing, retrieving, and deleting digital content. In other words, this service is responsible for supporting DefaultAssetStorageType#INTERNAL assets, which Broadleaf has both access to and responsibility for storing. It should be implemented for each type of storage provider to be used.
By default, there should only be a single implementation of this component in use at a time, meaning that there should only be one active storage provider used by the Asset service.
Note that use of this service can be restricted by a MIME type whitelist (see isWhitelistedMimeType(String)). This lets implementors restrict which kinds of files are supported. The method is linked to a system property of the form broadleaf.asset.internal.storageProvider.mimeTypeWhitelist.
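As a rough illustration of the whitelist check, the sketch below is one way an implementation might behave. The class name, the comma-separated property format, and the rule that an empty whitelist allows everything are all assumptions made for this example, not documented Broadleaf behavior.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Broadleaf: checks a MIME type against a configured whitelist.
class MimeTypeWhitelistSketch {
    private final Set<String> allowed;

    // Assumption: the property value is a comma-separated list such as "image/jpeg,image/png".
    MimeTypeWhitelistSketch(String commaSeparated) {
        this.allowed = Arrays.stream(commaSeparated.split(","))
                .map(s -> s.trim().toLowerCase(Locale.ROOT))
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
    }

    // Assumption: an empty whitelist means "no restriction".
    boolean isWhitelistedMimeType(String mimeType) {
        if (allowed.isEmpty()) {
            return true;
        }
        return mimeType != null && allowed.contains(mimeType.trim().toLowerCase(Locale.ROOT));
    }
}
```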
The method contract supports creating new resources from either source InputStream or File instances.
Also note that, in the case of failures while processing multiple resources in either the addResourcesFromStreams(Map) or deleteResources(Iterable) methods, the default implementation will try to process as many resources as possible before returning a composite exception with all files that failed and their causes (see BulkStorageException).
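The "process as many as possible, then report every failure" pattern can be sketched as follows. This is only an illustration: BulkFailureException is a hypothetical stand-in (the real BulkStorageException may carry different data), the SingleDelete interface is invented for the example, and throwing the composite exception at the end is an assumption about how it is surfaced.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a composite exception; Broadleaf's BulkStorageException may differ.
class BulkFailureException extends RuntimeException {
    private final Map<String, Exception> failures;

    BulkFailureException(Map<String, Exception> failures) {
        super(failures.size() + " resource(s) failed");
        this.failures = failures;
    }

    // Failed path -> cause of the failure.
    Map<String, Exception> getFailures() {
        return failures;
    }
}

class BulkDeleteSketch {
    // Hypothetical single-resource operation used only for illustration.
    interface SingleDelete {
        void delete(String path) throws Exception;
    }

    // Attempts every path, remembers each failure, and throws once at the end.
    static List<String> deleteAll(Iterable<String> paths, SingleDelete op) {
        List<String> deleted = new ArrayList<>();
        Map<String, Exception> failures = new LinkedHashMap<>();
        for (String path : paths) {
            try {
                op.delete(path);
                deleted.add(path);
            } catch (Exception e) {
                failures.put(path, e); // keep going; do not abort the batch
            }
        }
        if (!failures.isEmpty()) {
            throw new BulkFailureException(failures);
        }
        return deleted;
    }
}
```

A caller can then inspect getFailures() to retry or report only the resources that actually failed.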
StorageProvider Paths
Overview
Notably, the 'URLs' or 'paths' provided to StorageProvider are expected to already be globally unique.
For example, they will look something like /tenants/{tenant-id}/my-asset.jpg or /applications/{application-id}/my-asset.jpg.
The StorageProvider component itself is not intended to carry the burden of DataTracking context discrimination or other complex processes.
It is merely intended to be a data storage mechanism, and as such, it is the responsibility of the calling component to ensure the path/URL values are unique before submitting them to the StorageProvider.
For example, when the StorageService is processing assets, it does the following:
- For a newly uploaded Asset, if the Asset.url would match an existing asset in the same context, it applies an incrementing suffix to the file name in the URL (e.g., /my-asset.jpg becomes /my-asset-1.jpg). This ensures that the Asset.url itself is unique, at least within a particular tenant/application.
- Before calling the StorageProvider, Asset.url is prepended with the context-aware prefix (e.g., /tenants/{tenant-id} for tenant-level assets, or /applications/{application-id} for application-level assets) based on the Asset entity’s context. This allows different tenants/applications to have similarly named assets without clashing.
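The two steps above can be sketched roughly as follows. The class and method names are hypothetical, and the rule used to pick the prefix (an asset with an application ID is treated as application-level, otherwise tenant-level) is an assumption made for the example rather than the StorageService's actual logic.

```java
import java.util.Set;

// Hypothetical helper illustrating the two steps above; names are not Broadleaf APIs.
class AssetPathSketch {

    // Step 1: add an incrementing suffix before the extension until the URL is unused in the context.
    static String uniqueUrl(String url, Set<String> existingUrlsInContext) {
        if (!existingUrlsInContext.contains(url)) {
            return url;
        }
        int slash = url.lastIndexOf('/');
        int dot = url.lastIndexOf('.');
        boolean hasExtension = dot > slash + 1;
        String base = hasExtension ? url.substring(0, dot) : url;
        String ext = hasExtension ? url.substring(dot) : "";
        for (int i = 1; ; i++) {
            String candidate = base + "-" + i + ext;
            if (!existingUrlsInContext.contains(candidate)) {
                return candidate;
            }
        }
    }

    // Step 2: prepend the context-aware prefix.
    // Assumption: a non-null application ID marks an application-level asset.
    static String storagePath(String url, String tenantId, String applicationId) {
        String prefix = applicationId != null
                ? "/applications/" + applicationId
                : "/tenants/" + tenantId;
        return prefix + url;
    }
}
```

With an existing /my-asset.jpg in tenant t1, uniqueUrl yields /my-asset-1.jpg and storagePath then yields /tenants/t1/my-asset-1.jpg, which is the value handed to the StorageProvider.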
StorageLocationMapping
Internally, a StorageProvider implementation may not necessarily store assets exactly at the path/URL provided by the caller.
Broadleaf has a StorageLocationMapping entity that can optionally be used by such StorageProvider implementations to keep track of the mappings from the 'original location' (path/URL) provided by the caller to the actual 'storage provider location' that holds the binary data.
This has the additional advantage of making potential migrations of data easier to tackle, should it be necessary.
This entity is only expected to be managed/used internally by StorageProvider implementations that need it, and is not intended to be exposed to outside callers.
It is mainly an implementation detail.
Please see the StorageLocationMappingService/StorageLocationMappingRepository components for performing operations on this entity.
As of 2.0.3, both the FilesystemStorageProvider and the GoogleCloudStorageProvider have been updated to create StorageLocationMapping records for any newly created assets.
During asset resolution/management, the actual storage location is determined first by checking for a StorageLocationMapping record.
If no mapping record is available, the system will fall back to the legacy (pre-2.0.3) storage location calculation.
This guarantees backward compatibility and successful resolution of previously-created assets.
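The mapping-first lookup with a legacy fallback can be sketched as below. The resolver class is hypothetical, the in-memory Map stands in for whatever StorageLocationMappingService actually exposes (its API is not shown here), and the legacy calculation is passed in as a plain function.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical resolver; the real StorageLocationMappingService API is not shown in this document.
class StorageLocationResolverSketch {
    // Original (caller-provided) location -> actual storage provider location.
    private final Map<String, String> mappings;
    // Stand-in for the pre-2.0.3 location calculation.
    private final Function<String, String> legacyCalculation;

    StorageLocationResolverSketch(Map<String, String> mappings,
                                  Function<String, String> legacyCalculation) {
        this.mappings = mappings;
        this.legacyCalculation = legacyCalculation;
    }

    // Prefer a recorded mapping; otherwise fall back to the legacy calculation.
    String resolve(String originalLocation) {
        return Optional.ofNullable(mappings.get(originalLocation))
                .orElseGet(() -> legacyCalculation.apply(originalLocation));
    }
}
```

Assets created before mapping records existed have no entry in the map, so they resolve through the legacy function exactly as they did before.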
Implementations
There are a few implementations of StorageProvider that come with Broadleaf by default.
FilesystemStorageProvider
This is the default StorageProvider implementation.
It is active when broadleaf.asset.internal.storage-provider.implementation=FILESYSTEM, and can be configured further with properties under broadleaf.asset.internal.storage-provider.filesystem.*, as noted in InternalAssetProperties.
With this approach, the system will access and manage assets as files under a directory path.
In a production environment, the typical pattern is to establish a shared persistent volume, and use NFS volume mounts to allow each individual container to access the data at a locally-available path.
Storage Path Calculation
To achieve a more efficient and balanced distribution of files in the filesystem, the FilesystemStorageProvider will internally hash the caller-provided URL/path and use the result to calculate the actual storage path.
This evenly distributes the files and avoids having a single directory with too many files.
- Behavior since 2.0.3
  - If the URL is /product/myproductimage.jpg, then the MD5 would be 35ec52a8dbd8cf3e2c650495001fe55f. Assuming broadleaf.asset.internal.storage-provider.filesystem.max-generated-directory-depth=2, this would result in the following file on the filesystem: {providerRootLocation}/35/ec/{RANDOM-ULID}.jpg.
  - In this implementation, the filename is set to a fully random ULID value (albeit with the same extension as the original URL). This ensures that a collision, where multiple distinct caller-provided URLs map to the same actual location, is not possible.
  - This requires the StorageLocationMapping concept to map between original/actual paths.
- Behavior prior to 2.0.3
  - If the URL is /product/myproductimage.jpg, then the MD5 would be 35ec52a8dbd8cf3e2c650495001fe55f. Assuming broadleaf.asset.internal.storage-provider.filesystem.max-generated-directory-depth=2, this would result in the following file on the filesystem: {providerRootLocation}/35/ec/myproductimage.jpg.
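The hashed directory calculation described above can be sketched as follows. This is a minimal illustration, not the FilesystemStorageProvider code: it assumes the hash input is the full URL encoded as UTF-8 and that each directory level takes two hex characters from the start of the digest, so it makes no claim to reproduce the exact digest quoted in the example. The random ID is supplied by the caller, since ULID generation needs a separate library.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of hash-based path distribution; names are not Broadleaf APIs.
class FilesystemPathSketch {

    // Hex-encoded MD5 of the input, zero-padded to 32 characters.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is unavailable", e);
        }
    }

    // One two-character directory per level, taken from the start of the hash.
    static String hashedDirectories(String url, int maxDepth) {
        String hash = md5Hex(url);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < maxDepth; i++) {
            sb.append(hash, i * 2, i * 2 + 2).append('/');
        }
        return sb.toString();
    }

    // Pre-2.0.3 style: the original file name is kept.
    static String legacyPath(String root, String url, int maxDepth) {
        String fileName = url.substring(url.lastIndexOf('/') + 1);
        return root + "/" + hashedDirectories(url, maxDepth) + fileName;
    }

    // 2.0.3+ style: the file name is replaced by a caller-supplied random ID
    // (a ULID in the text above), keeping only the original extension.
    static String randomizedPath(String root, String url, int maxDepth, String randomId) {
        int dot = url.lastIndexOf('.');
        String ext = dot > url.lastIndexOf('/') ? url.substring(dot) : "";
        return root + "/" + hashedDirectories(url, maxDepth) + randomId + ext;
    }
}
```

Because randomizedPath discards the original file name, the original URL cannot be recovered from the stored path, which is why the newer behavior depends on a StorageLocationMapping record.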
GoogleCloudStorageProvider
This implementation is active when broadleaf.asset.internal.storage-provider.implementation=GCS, and can be configured further with properties under broadleaf.asset.internal.storage-provider.google-cloud-storage.*, as noted in InternalAssetProperties.
This also requires the com.google.cloud:google-cloud-storage dependency to be available.
With this approach, the system will access and manage assets as objects in a GCS bucket.
Storage Path Calculation
- Behavior since 2.0.3
  - If the URL is /product/myproductimage.jpg, and broadleaf.asset.internal.storage-provider.google-cloud-storage.path-prefix-in-bucket=blc_assets, then this would result in the following object name under the configured bucket: blc_assets/{RANDOM-ALPHANUM-PREFIX}-{RANDOM-ULID}.jpg.
  - In this implementation, the filename is set to a fully random value (albeit with the same extension as the original URL). This ensures that a collision, where multiple distinct caller-provided URLs map to the same actual location, is not possible. The filename is a ULID prepended with random alphanumeric characters to honor Google Cloud Storage object naming best practices, particularly those around using non-sequential names.
  - This requires the StorageLocationMapping concept to map between original/actual paths.
- Behavior prior to 2.0.3
  - If the URL is /product/myproductimage.jpg, then the MD5 would be 35ec52a8dbd8cf3e2c650495001fe55f. Assuming broadleaf.asset.internal.storage-provider.google-cloud-storage.max-generated-directory-depth=2 and broadleaf.asset.internal.storage-provider.google-cloud-storage.path-prefix-in-bucket=blc_assets, this would result in the following object name under the configured bucket: blc_assets/35/ec/myproductimage.jpg.
  - In this implementation, the filename from the original URL is preserved in the final object name.
  - This implementation was originally just a carry-over that matched the approach from FilesystemStorageProvider for the sake of consistency.
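The post-2.0.3 object naming scheme can be approximated as shown below. This is a sketch under stated assumptions: the helper class is hypothetical, the length and character set of the random prefix are invented for the example, and the ULID is passed in by the caller rather than generated here.

```java
import java.security.SecureRandom;

// Hypothetical helper; not the GoogleCloudStorageProvider implementation.
class GcsObjectNameSketch {
    private static final String ALPHANUM = "abcdefghijklmnopqrstuvwxyz0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Random lowercase alphanumeric string of the requested length.
    static String randomAlphanumeric(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHANUM.charAt(RANDOM.nextInt(ALPHANUM.length())));
        }
        return sb.toString();
    }

    // Builds {pathPrefixInBucket}/{random-prefix}-{ulid}{extension}.
    // Assumption: randomPrefixLength is a free parameter chosen for this sketch.
    static String objectName(String pathPrefixInBucket, String url, String ulid, int randomPrefixLength) {
        int dot = url.lastIndexOf('.');
        String ext = dot > url.lastIndexOf('/') ? url.substring(dot) : "";
        return pathPrefixInBucket + "/" + randomAlphanumeric(randomPrefixLength) + "-" + ulid + ext;
    }
}
```

Placing the random characters first avoids object names that start with a monotonically increasing sequence, which is the motivation the text gives for the non-sequential prefix.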