The Ebulk tool makes it easy to exchange or archive very large data sets. It ingests data sets into the Wendelin-IA platform over different protocols, and downloads them from it. It also lets you make local changes to a data set and upload the added and modified files. A key feature of Ebulk is its ability to resume interrupted transfers and recover from the errors they cause. <b><a href="erp5/web_site_module/fif_data_runner/#/?page=ebulk_doc" style="color:#FF9D6C">See documentation</a></b>.
<p class="last">The Ebulk tool makes it easy to exchange or archive very large data sets. It ingests data sets from different storage inputs into the Wendelin-IA platform (based on the <a target="_blank" href="https://wendelin.nexedi.com/">Wendelin</a> - <a target="_blank" href="https://neo.nexedi.com/">NEO</a> - <a target="_blank" href="https://erp5.nexedi.com/">ERP5</a> stack), and downloads them from it. It also lets you make local changes to a data set and upload the added and modified files. A key feature of Ebulk is its ability to resume interrupted transfers and recover from the errors they cause.</p>
<h1>REQUIREMENTS</h1>
<p class="last">Java 8: Ebulk relies on the Embulk v0.9.7 bulk data loader, a Java application (see the <a target="_blank" href="http://www.embulk.org/">Embulk documentation</a>), so Java 8 is required to install the Ebulk tool.</p>
<h1>Ebulk + Wendelin = Big Data sharing platform</h1>
<p>The <a href="erp5/web_site_module/fif_data_runner/#/?page=ebulk_doc">Ebulk</a> tool and the <a target="_blank" href="https://wendelin.nexedi.com/">Wendelin</a> platform combine to form an easy-to-use Data Lake for sharing petabytes of data grouped into data sets. Big Data sharing is essential for research and startups, because building new A.I. models requires access to large data sets. Those data sets are usually held by big platforms such as Google or Alibaba, which tend to keep them secret. This project offers a solution to the big data sharing problem by addressing the following key points:</p>
<ul>
<li>Huge transfers (over slow and unreliable networks)</li>
<p><a href="erp5/web_site_module/fif_data_runner/#/?page=contact">Contact us to get a full user account</a></p>
</div>
</div>
<h1>Data lake</h1>
<p>Dozens of public and private big data sets are available on the platform: terabytes of data of any kind, including binaries such as medical images, ndarrays and more. Do you want to download data sets or share your own data? <a href="erp5/web_site_module/fif_data_runner/#/?page=download">Download</a> our Ebulk tool to transfer big data! Please <a href="erp5/web_site_module/fif_data_runner/#/?page=contact">contact us</a> to register and get a user account. See our full <a href="erp5/web_site_module/fif_data_runner/#/?page=fifdata">data set list</a>!</p>
<h1>Ebulk tool</h1>
<p>The Ebulk tool is a wrapper for <a target="_blank" href="http://www.embulk.org/docs/">Embulk</a>, an open-source bulk data loader that helps transfer data between various databases, storages, file formats, and cloud services. It supports input files of any format, parallel and distributed execution to deal with big data sets, transaction control to guarantee all-or-nothing file transfer, and resuming of operations. Ebulk is as easy to use as git: big data transfers take very few commands. Please <a href="erp5/web_site_module/fif_data_runner/#/?page=download">download</a> Ebulk and check the <a href="erp5/web_site_module/fif_data_runner/#/?page=ebulk_doc">documentation</a>.</p>