Thank you Sir,
I am currently developing a small OLTP web application using the Spring
Framework. Although the Spring Framework is open source, it is actually a
professional product which comes with a professional code generator at
https://start.spring.io/. The code generator is flawless and professional, like
yourself.

I am using the following classes from the java.net.http package to ingest
(fetch) data across the Wide Area Network for processing. This HTTP client API
only became a standard part of the JDK recently (JDK 11).

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;


// temporary store: filled here and only used once population is complete,
// to prevent errors from reading a half-built list
List<LocationStats> newStats = new ArrayList<>();



// create a new HTTP client (java.net.http, available in JDK 11+)
HttpClient client = HttpClient.newHttpClient();

// create the request for the URL using the builder pattern
HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create(VIRUS_DATA_URL))
        .build();



// send the request and read the body of the response as a String
HttpResponse<String> httpResponse = client.send(request, HttpResponse.BodyHandlers.ofString());

// System.out.println(httpResponse.body());
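
The fetch above does not set a timeout or check the HTTP status, so an error
page would be passed on to the CSV parser as if it were data. Below is a
minimal sketch of one way to guard against that. The class name
VirusDataFetcher, the helper names and the 30 second timeout are my own
assumptions for illustration, not part of the application.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class; the name and structure are assumptions.
class VirusDataFetcher {

    // Build a GET request with a request timeout (30 seconds is an assumed value).
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .timeout(Duration.ofSeconds(30))
                .GET()
                .build();
    }

    // True only for 2xx status codes.
    static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }

    // Fetch the response body, failing loudly on a non-2xx status
    // instead of handing an error page to the CSV parser.
    static String fetchBody(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (!isSuccess(response.statusCode())) {
            throw new IOException(
                    "Unexpected HTTP status " + response.statusCode() + " for " + url);
        }
        return response.body();
    }
}
```

With this in place, the body could be obtained with
VirusDataFetcher.fetchBody(HttpClient.newHttpClient(), VIRUS_DATA_URL).
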


I am also using the Apache Commons CSV library
(http://commons.apache.org/proper/commons-csv/user-guide.html) to process the
raw data, ready for display in the browser.
import java.io.StringReader;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;

// wrap the whole CSV response body in a reader
StringReader csvBodyReader = new StringReader(httpResponse.body());

// parse the rows, treating the first row as the table header
Iterable<CSVRecord> records = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(csvBodyReader);

for (CSVRecord record : records) {
    LocationStats locationStat = new LocationStats();
    locationStat.setState(record.get("Province/State"));
    locationStat.setCountry(record.get("Country/Region"));

    // the last column holds the latest total case count
    int latestCases = Integer.parseInt(record.get(record.size() - 1));
    locationStat.setLatestTotalCases(latestCases);

    newStats.add(locationStat);
    System.out.println(locationStat);
}

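
The LocationStats class itself is not shown above. Below is a minimal sketch
of what it might look like, assuming a plain bean with the three setters used
in the loop. The getters, the toString() format and the parseCases helper
(which treats a blank cell as 0, an assumed policy) are my own additions for
illustration. Note that System.out.println(locationStat) only prints something
readable if the class overrides toString().

```java
// Hypothetical sketch of LocationStats: only the three setters appear in the
// loop above; everything else here is an assumption.
class LocationStats {
    private String state;
    private String country;
    private int latestTotalCases;

    public String getState() { return state; }
    public void setState(String state) { this.state = state; }

    public String getCountry() { return country; }
    public void setCountry(String country) { this.country = country; }

    public int getLatestTotalCases() { return latestTotalCases; }
    public void setLatestTotalCases(int latestTotalCases) {
        this.latestTotalCases = latestTotalCases;
    }

    // Parse a case count defensively: a null or blank cell becomes 0
    // (an assumed policy); anything else must be a valid integer,
    // otherwise Integer.parseInt throws NumberFormatException.
    static int parseCases(String raw) {
        if (raw == null || raw.isBlank()) {
            return 0;
        }
        return Integer.parseInt(raw.trim());
    }

    // Readable output for System.out.println(locationStat).
    @Override
    public String toString() {
        return "LocationStats{state='" + state + "', country='" + country
                + "', latestTotalCases=" + latestTotalCases + "}";
    }
}
```

In the loop, Integer.parseInt(record.get(record.size() - 1)) could then be
swapped for LocationStats.parseCases(record.get(record.size() - 1)) so that an
empty last column does not abort the whole import.
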
Thank you once again, sir, for clarifying WEKA and the scope of its use cases.
  
jane thorpe
janethor...@aol.com
 
 
-----Original Message-----
From: Teemu Heikkilä <te...@emblica.fi.INVALID>
To: jane thorpe <janethor...@aol.com.INVALID>
CC: user <user@spark.apache.org>
Sent: Sun, 12 Apr 2020 22:33
Subject: Re: covid 19 Data [DISCUSSION]

Hi Jane!
The data you pointed to is a couple of tens of MBs. I wouldn't exactly call it
"big data", and you definitely don't need Apache Spark to process that amount
of data. I would suggest using some other tools for your processing needs.
WEKA is a "full suite" for data analysis and visualisation, and it's probably a
good choice for the task. If you want to go lower level, as with Spark, and you
are familiar with Python, pandas could be a good library to investigate.
br, Teemu Heikkilä

te...@emblica.com 
+358 40 0963509

Emblica | The data engineering company
Kaisaniemenkatu 1 B
00100 Helsinki
https://emblica.com

jane thorpe <janethor...@aol.com.INVALID> wrote on 12.4.2020 at 22.30:
 Hi,
Three weeks ago a PhD guy proposed starting a project to use Apache Spark
to help the WHO with predictive analysis using COVID-19 data.

I have located the daily updated data.
It can be found here:
https://github.com/CSSEGISandData/COVID-19
I was wondering if Apache Spark is up to the job of handling BIG DATA of this
size, or would it be better to use WEKA?
Please discuss which product is more suitable.

 
Jane 
janethor...@aol.com

