The Odalic server distinguishes between two kinds of input CSV files: local and remote. Local files are uploaded by the user and reside in the server storage. Remote files are defined only by their URL. Both kinds share the same metadata structure (differing only in their cached flag), and even local files are assigned a URL, which is the same one under which they can be retrieved with a GET request from the REST API. The files are parsed only immediately before execution, so results may differ if an underlying remote file changes or if the user changes the parsing format of a local or remote file. Every input file can be shared among multiple processing tasks belonging to the same user; it therefore cannot be deleted as long as at least one task refers to it. The references are kept by the implementing FileService, with which each referring Task has to be subscribed, and from which it has to be unsubscribed upon its deletion.
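The subscription mechanism described above can be illustrated with a minimal sketch. It is not the actual Odalic FileService: the class name FileUsageRegistry, its method names, and the use of plain string identifiers for files and tasks are assumptions made only for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of reference tracking; not the real Odalic FileService. */
final class FileUsageRegistry {

  // Maps a file identifier to the identifiers of the tasks that refer to it.
  private final Map<String, Set<String>> subscribers = new HashMap<>();

  /** Registers a task as a user of the file. */
  void subscribe(String fileId, String taskId) {
    subscribers.computeIfAbsent(fileId, key -> new HashSet<>()).add(taskId);
  }

  /** Removes the task from the users of the file, e.g. when the task is deleted. */
  void unsubscribe(String fileId, String taskId) {
    Set<String> tasks = subscribers.get(fileId);
    if (tasks == null) {
      return;
    }
    tasks.remove(taskId);
    if (tasks.isEmpty()) {
      subscribers.remove(fileId);
    }
  }

  /** Returns true while at least one task refers to the file. */
  boolean isUsed(String fileId) {
    return subscribers.containsKey(fileId);
  }

  /** Throws if the file is still referenced and therefore must not be deleted. */
  void checkDeletable(String fileId) {
    if (isUsed(fileId)) {
      throw new IllegalStateException("File " + fileId + " is still used by a task.");
    }
  }
}
```

In this sketch a delete operation would call checkDeletable first, so a file shared by two tasks becomes deletable only after both tasks have been unsubscribed.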
Parsing is done with the Apache Commons CSV library, which is able to detect line separators but provides no means of obtaining that information for further use. Detection code was therefore added to the parser, because the line separators used in the input are needed in order to export the results in the form expected by the client. The result of parsing is a model of the CSV file, represented by cz.cuni.mff.xrg.odalic.input.Input, consisting of the list of rows and the header (the first row). Odalic assumes the file to be consistent, that is, to contain the same number of values in every row, and raises an exception if this is not the case.
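The two checks described above, line separator detection and the consistency check, might look roughly like the following sketch. It does not reproduce the Odalic parser: the class CsvChecks, its method names, the double quote as the quoting character, and the fallback to "\n" when no separator is found are all assumptions made for illustration, and the sketch deliberately avoids any dependency on Apache Commons CSV.

```java
import java.util.List;

/** Hypothetical helpers; not the actual Odalic parser code. */
final class CsvChecks {

  private CsvChecks() {}

  /**
   * Returns the first line separator found outside of quoted fields:
   * "\r\n", "\n" or "\r". Falls back to "\n" (an assumption) if none is present.
   */
  static String detectLineSeparator(String content) {
    boolean inQuotes = false;
    for (int i = 0; i < content.length(); i++) {
      char c = content.charAt(i);
      if (c == '"') {
        // An escaped quote ("") toggles twice, leaving the state unchanged.
        inQuotes = !inQuotes;
      } else if (!inQuotes) {
        if (c == '\n') {
          return "\n";
        }
        if (c == '\r') {
          boolean crlf = i + 1 < content.length() && content.charAt(i + 1) == '\n';
          return crlf ? "\r\n" : "\r";
        }
      }
    }
    return "\n";
  }

  /** Throws if any row has a different number of values than the header. */
  static void requireConsistent(List<String> header, List<List<String>> rows) {
    for (int i = 0; i < rows.size(); i++) {
      int actual = rows.get(i).size();
      if (actual != header.size()) {
        throw new IllegalArgumentException(
            "Row " + (i + 1) + " has " + actual + " values, expected " + header.size() + ".");
      }
    }
  }
}
```

The detected separator would then be kept alongside the parsed rows so that an export can write the results with the same line endings as the input.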