Update 16 Nov 2016:
Research on the performance of cloud synchronization services like ownCloud, Seafile and Dropbox
has shown that on-premise services perform better than public clouds when
syncing big files (higher upload and download transfer rates can be obtained thanks to a simpler
implementation and fewer users competing for a given bandwidth) and are very competitive when
syncing mixtures of files.
Unlike typical web services, cloud sync and share generates, for some activity types, far more requests per user than a web server typically has to handle.
Underutilized upload/download bandwidth and long distribution tails (penalizing transfers
of small files over WAN) are characteristic of services using the current ownCloud synchronization
protocol. Another important factor in synchronization performance is the number of operations performed
per single-file request on the web server. Along with HTTP/2 extensions (which will be addressed separately), the bundling feature should reduce the impact of latency and significantly lower the number of requests to the server, making the server more lightweight, utilizing the pipe better and, in turn, syncing files faster.
With zero latency (sync running on the local machine), the following scenario was tested. Note that the missing latency strongly favours traditional HTTP/1 PUTs in this scenario:
https://s3.owncloud.com/owncloud/index.php/s/kSVQvr3y7EdmZ6b?path=%2F
1000 files of 1 kB and 100 files of 100 kB, 11 MB in total.
The number of requests was reduced from 1100 to 15 (a typical number for web content).
Sync time on the test machine was reduced, on average, from 37 s to 31 s (this already takes into account the recent sync performance improvement for single PUTs in a folder, which grew out of the bundling concept: https://github.com/owncloud/client/pull/5230, https://github.com/owncloud/client/pull/5274).
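
To make the latency argument concrete, here is a minimal sketch of the scenario's arithmetic together with a toy cost model. The model (one round trip per request plus raw transfer time) and the WAN parameters `RTT_S` and `BANDWIDTH_BPS` are assumptions chosen only for illustration; they are not measurements from this PR and they ignore server-side per-request work.

```python
# Test scenario from the text: 1000 files of 1 kB and 100 files of 100 kB.
KB = 1000
files = [1 * KB] * 1000 + [100 * KB] * 100

total_bytes = sum(files)          # 11_000_000 bytes, i.e. 11 MB
requests_unbundled = len(files)   # one PUT per file
requests_bundled = 15             # figure reported in the text


def transfer_time(n_requests, payload_bytes, rtt_s, bandwidth_bps):
    """Toy model: every request costs one round trip, plus raw transfer time."""
    return n_requests * rtt_s + payload_bytes / bandwidth_bps


# Hypothetical WAN parameters (assumptions, not taken from the test setup).
RTT_S = 0.05               # 50 ms round trip
BANDWIDTH_BPS = 10e6 / 8   # 10 Mbit/s expressed in bytes per second

t_put = transfer_time(requests_unbundled, total_bytes, RTT_S, BANDWIDTH_BPS)
t_bundle = transfer_time(requests_bundled, total_bytes, RTT_S, BANDWIDTH_BPS)

print(f"total payload: {total_bytes / 1e6:.0f} MB in {requests_unbundled} files")
print(f"unbundled: {t_put:.2f} s, bundled: {t_bundle:.2f} s")
```

With zero latency the round-trip term vanishes, which is why the local test above is the least favourable case for bundling.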
This is the profile for 1 (ONE!) PUT of a 22 kB file using standard HTTP/1:

This is the profile for 1 (ONE!) bundle containing 10 (TEN) 22 kB files, 220 kB in total, using standard HTTP/1:

Bundled request requirements
> 1. A bundle Request receives a 207 Multi-Status Response with the individual 20x, 30x, 40x, 50x statuses for each file. It receives a 400 Bad Request response with an error message if the Request was malformed.
>
> 2. Request body can be any mime-type, with full implementation freedom.
>
> 3. The request finishes by delivering the last part of a successful response after all linked operations have finished successfully, or is aborted immediately in case of request cancellation/termination.
>
> 4. If the request cannot be executed or the response cannot be correctly constructed, the request has to be aborted and a 4xx-5xx error has to be returned for the whole request.
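
A client-side view of requirements 1 and 4 could look like the sketch below. It assumes the 207 body uses the standard WebDAV `multistatus` XML layout with one `href` and one `status` element per file; the exact response format of the bundle endpoint is not specified here, and `parse_bundle_response` and the sample paths are hypothetical.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # WebDAV XML namespace in ElementTree notation


def parse_bundle_response(http_status, body):
    """Return {href: status code} for a 207 reply; raise for a whole-request error."""
    if http_status != 207:
        # Requirement 4: the bundle failed as a whole (e.g. 400 Bad Request).
        raise RuntimeError(f"bundle failed as a whole: HTTP {http_status}")
    results = {}
    for response in ET.fromstring(body).iter(DAV + "response"):
        href = response.findtext(DAV + "href")
        status_line = response.findtext(DAV + "status")  # e.g. "HTTP/1.1 201 Created"
        results[href] = int(status_line.split()[1])
    return results


# Hypothetical reply: one file stored, one rejected.
sample = (
    '<d:multistatus xmlns:d="DAV:">'
    "<d:response><d:href>/docs/a.txt</d:href>"
    "<d:status>HTTP/1.1 201 Created</d:status></d:response>"
    "<d:response><d:href>/docs/b.txt</d:href>"
    "<d:status>HTTP/1.1 507 Insufficient Storage</d:status></d:response>"
    "</d:multistatus>"
)
per_file = parse_bundle_response(207, sample)
print(per_file)
```

The point of the split is that a per-file failure stays local to that file, while a malformed bundle is reported once for the whole request.
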
We have already implemented prototypes for both multipart/mixed and multipart/related, discussed them at length and tried them out.
I found the following limitations for each request type, in order of implementation:
Multipart/related:
- This MIME type carries, in its first part, the list of files to be created as key->value pairs, where the key is the path and the value is the metadata for that file. The response is built from the keys in the metadata part, and the files are reconstructed from the binary contents in the request body, referenced by Content-ID.
- This MIME type makes it easy to return a response per file: the key-value structure and the validation at the beginning let you construct the response for each key/file (even if a Content-ID is missing, you can still report for that specific file that its binary content was not found) or return a parsing error right at the beginning.
- This is a high-performance solution: files are added to OC as they are read from the request body, and chunked transfer encoding can be used for the response while files are being added.
- Disadvantage: the list of files is specified in the first part, so the binary contents are anonymous when the request is read without the first part.
- Disadvantage: in case of a parsing error, the request has consumed bandwidth for nothing. This should, however, not happen in practice.
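
The multipart/related flow described above can be sketched as follows. This is an illustration under assumptions, not the prototype's code: the JSON metadata layout, the `content-id` and `length` keys, the delimiter, and the 201/400 status codes are all hypothetical, and the splitter naively assumes that file contents never contain the delimiter.

```python
import json

DELIM = b"--bundle-boundary"  # hypothetical delimiter (boundary with leading dashes)


def build_bundle(files, drop_content_of=()):
    """files: {path: bytes}. The first part lists every file; the rest carry the bodies."""
    metadata = {path: {"content-id": str(i), "length": len(data)}
                for i, (path, data) in enumerate(files.items())}
    parts = [b"Content-Type: application/json\r\n\r\n" + json.dumps(metadata).encode()]
    for i, (path, data) in enumerate(files.items()):
        if path not in drop_content_of:  # lets us simulate a missing body
            parts.append(b"Content-ID: %d\r\n\r\n" % i + data)
    return b"".join(DELIM + b"\r\n" + p + b"\r\n" for p in parts) + DELIM + b"--\r\n"


def split_parts(body):
    """Yield (headers, content) for each part, in request order."""
    for chunk in (b"\r\n" + body).split(b"\r\n" + DELIM)[1:]:
        if chunk.startswith(b"--"):  # closing delimiter
            return
        raw_headers, _, content = chunk[2:].partition(b"\r\n\r\n")
        headers = {}
        for line in raw_headers.split(b"\r\n"):
            name, _, value = line.partition(b":")
            headers[name.strip().lower().decode()] = value.strip().decode()
        yield headers, content


def process_bundle(body, store):
    parts = split_parts(body)
    _, first = next(parts)
    metadata = json.loads(first)  # a parse error surfaces here, before any file is stored
    by_cid = {meta["content-id"]: path for path, meta in metadata.items()}
    statuses = {path: 400 for path in metadata}  # default: binary content not found
    for headers, content in parts:  # store each file as soon as it is read
        path = by_cid.get(headers.get("content-id"))
        if path is not None:
            store[path] = content
            statuses[path] = 201
    return statuses


files = {"/a.txt": b"hello", "/b.bin": b"\x00\x01\r\n\r\nxyz"}
store = {}
print(process_bundle(build_bundle(files, drop_content_of={"/b.bin"}), store))
```

Because every expected path is known after the first part, a file whose body never arrives still gets its own status instead of failing the whole bundle.
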
Multipart/mixed:
- This MIME type carries, in each part, headers (the metadata for a file) and, in the part body, the actual file content.
- As in multipart/related, we can use chunked transfer encoding for the response.
- Advantage: reading the raw request, the parts are independent of each other.
- This MIME type requires parsing the request body on the fly in order to read each part and simultaneously add it to OC. To construct the response for each file correctly, each part has to be parsed and validated, since it serves as the container for the file.
- To validate the request at the beginning, one would need to parse the request body and keep it in memory or on disk. The other option is to seek in the request body, but this is dangerous and unpredictable. As a result, at peak the bundle occupies memory proportional to the chunk size: typically 10 MB for the request itself plus 10 MB for in-memory storage, decreasing with each file added to OC. This gives 20 MB per request, and 20 GB per 1000 simultaneous requests at peak.
- The independent parts have to be parsed and validated on the fly. If a single part fails to parse, the error has to be raised for the full request, since the response for that file may be impossible to construct (for example, when its URL header is missing); the files of that bundle already added to OC then receive no response either. Acknowledging half of the files and afterwards returning an error for the whole request violates the architectural concept for requests in OC (see points 3 and 4).
- Disadvantage: in case of a parsing error, the request has consumed bandwidth for nothing. This should, however, not happen in practice.
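
The partial-failure problem and the memory estimate for multipart/mixed can be illustrated with a minimal sketch. The parts are given as already-split `(headers, content)` pairs, and the header name `x-oc-path`, the `BundleError` class and both function names are hypothetical.

```python
class BundleError(Exception):
    """Whole-request failure (would map to a 4xx/5xx for the entire bundle)."""


def process_mixed(parts, store):
    """parts: iterable of (headers, content) pairs, handled strictly in request order."""
    statuses = {}
    for headers, content in parts:
        path = headers.get("x-oc-path")  # hypothetical per-part header
        if path is None:
            # Without a target path no per-file response can be built for this part,
            # so the whole request fails, although earlier files are already stored.
            raise BundleError(f"malformed part after {len(statuses)} stored file(s)")
        store[path] = content
        statuses[path] = 201
    return statuses


def peak_memory_bytes(chunk_size, concurrent_requests):
    """Request body plus an equally sized validation buffer, per the estimate above."""
    return 2 * chunk_size * concurrent_requests


store = {}
parts = [
    ({"x-oc-path": "/a.txt"}, b"a"),
    ({"x-oc-path": "/b.txt"}, b"b"),
    ({}, b"c"),  # third part lacks its path header
]
try:
    process_mixed(parts, store)
    failed = False
except BundleError as exc:
    failed = True
    print("request failed:", exc)

print("files already stored:", sorted(store))
print("peak memory:", peak_memory_bytes(10 * 10**6, 1000) // 10**9, "GB")
```

Two files end up stored even though the client only sees an error for the whole request, which is the conflict with requirements 3 and 4 noted above.
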
In this version, the implementation will follow multipart/related.
- [x] Bundle feature passes proof-of-concept tests
- [x] Integrated error handling
- [ ] Full unit test coverage
- [ ] Documentation