Thank you Sir, I am currently developing a small OLTP web application using Spring Framework. Although Spring Framework is open source, it is a professional product that comes with a professional code generator at https://start.spring.io/. The code generator is flawless and professional, like yourself.
I am using the following JDK classes to ingest (fetch) data across the Wide Area Network for processing. The java.net.http classes only became available recently (standard since JDK 11).

    import java.io.StringReader;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.commons.csv.CSVFormat;
    import org.apache.commons.csv.CSVRecord;

    // temporary list, so the results are only used after population is complete
    List<LocationStats> newStats = new ArrayList<>();

    // create a new HTTP client (java.net.http, standard since JDK 11)
    HttpClient client = HttpClient.newHttpClient();

    // create the request for the URL using the builder pattern
    HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(VIRUS_DATA_URL))
            .build();

    // send the request and read the body of the response as a String
    HttpResponse<String> httpResponse = client.send(request, HttpResponse.BodyHandlers.ofString());
    // System.out.println(httpResponse.body());

I am also using the Apache Commons CSV library (http://commons.apache.org/proper/commons-csv/user-guide.html) to process the raw data, ready for display in the browser.

    // read the whole CSV body
    StringReader csvBodyReader = new StringReader(httpResponse.body());

    // parse each row, treating the first row as the table header
    Iterable<CSVRecord> records = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(csvBodyReader);
    for (CSVRecord record : records) {
        LocationStats locationStat = new LocationStats();
        locationStat.setState(record.get("Province/State"));
        locationStat.setCountry(record.get("Country/Region"));
        // the last column holds the most recent cumulative total
        int latestCases = Integer.parseInt(record.get(record.size() - 1));
        locationStat.setLatestTotalCases(latestCases);
        newStats.add(locationStat);
        System.out.println(locationStat);
    }

Thank you once again, sir, for clarifying WEKA and the scope of its use cases.

jane thorpe
janethor...@aol.com

-----Original Message-----
From: Teemu Heikkilä <te...@emblica.fi.INVALID>
To: jane thorpe <janethor...@aol.com.INVALID>
CC: user <user@spark.apache.org>
Sent: Sun, 12 Apr 2020 22:33
Subject: Re: covid 19 Data [DISCUSSION]

Hi Jane!
The data you pointed to there is a couple of tens of MBs. I wouldn't exactly say it's "big data", and you definitely don't need to use Apache Spark for processing that amount of data. I would suggest using some other tools for your processing needs. WEKA is a "full suite" for data analysis and visualisation, and it's probably a good choice for the task. If you want to go lower level, like with Spark, and you are familiar with Python, pandas could be a good library to investigate.

br,
Teemu Heikkilä
te...@emblica.com
+358 40 0963509
Emblica | The data engineering company
Kaisaniemenkatu 1 B
00100 Helsinki
https://emblica.com

jane thorpe <janethor...@aol.com.INVALID> wrote on 12.4.2020 at 22:30:

Hi,

Three weeks ago a PhD guy proposed starting a project to use Apache Spark to help the WHO with predictive analysis using COVID-19 data. I have located the daily updated data. It can be found here: https://github.com/CSSEGISandData/COVID-19. I was wondering if Apache Spark is up to the job of handling BIG DATA of this size, or would it be better to use WEKA? Please discuss which product is more suitable.

Jane
janethor...@aol.com
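
For anyone who wants to try the "take the last column as the latest total" step from the first message without a network connection or extra dependencies, here is a minimal sketch using only the JDK (11 or later). It is not the Commons CSV code from the thread: the class name LatestCasesSketch, the two helper methods, and the sample rows are all made up for illustration, and the hand-rolled splitter only handles commas inside double-quoted fields, which is an assumption about what the real file needs.

```java
import java.util.ArrayList;
import java.util.List;

public class LatestCasesSketch {

    /** Splits one CSV line on commas, keeping commas inside double quotes. */
    public static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    /** Returns the integer in the last column of every data row (header skipped). */
    public static List<Integer> latestTotals(String csvBody) {
        List<Integer> totals = new ArrayList<>();
        String[] lines = csvBody.split("\\R");
        for (int i = 1; i < lines.length; i++) { // index 0 is the header row
            if (lines[i].isBlank()) {
                continue;
            }
            List<String> fields = splitCsvLine(lines[i]);
            totals.add(Integer.parseInt(fields.get(fields.size() - 1).trim()));
        }
        return totals;
    }

    public static void main(String[] args) {
        // Invented sample rows, not real figures.
        String sample = "Province/State,Country/Region,Lat,Long,day1,day2\n"
                + ",\"Country, B\",1.0,2.0,1,2\n"
                + "StateA,CountryA,3.0,4.0,10,20\n";
        System.out.println(latestTotals(sample)); // prints [2, 20]
    }
}
```

In a real application a CSV library such as Commons CSV remains the safer choice, since it also covers cases this sketch ignores, for example line breaks inside quoted fields.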