You seem to be implying the error is intermittent.  

You also seem to be implying that data is being ingested via JDBC. If so, the
connection has proven itself to be working, unless no data is arriving from the
JDBC channel at all. If no data is arriving, then JDBC could be the cause.

If the error is intermittent, then it is likely that a resource involved in
processing is filling to capacity.

Try reducing the data ingestion volume and see if the job completes, then
increase the volume ingested incrementally.
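
As a rough sketch of that stepwise approach, the helpers below build a
row-capped Oracle subquery and a growing series of caps. This is only an
illustration: the table name SCOTT.ORDERS, the alias "capped", the starting cap
and the growth factor are all made-up assumptions, not taken from your job. The
resulting string should be usable as the table argument of spark.read.jdbc,
since Spark accepts a parenthesised subquery with an alias there.

```python
def bounded_table(table: str, max_rows: int) -> str:
    """Wrap an Oracle table in a subquery capped at max_rows rows.

    The alias "capped" is hypothetical; Oracle table aliases take no AS keyword.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    return f"(SELECT * FROM {table} WHERE ROWNUM <= {max_rows}) capped"


def row_caps(start: int, limit: int, factor: int = 10):
    """Yield growing row caps below limit, then limit itself as the last cap."""
    if start <= 0 or factor < 2:
        raise ValueError("start must be positive and factor at least 2")
    cap = start
    while cap < limit:
        yield cap
        cap *= factor
    yield limit


if __name__ == "__main__":
    # SCOTT.ORDERS is a placeholder table name, not the real source table.
    for cap in row_caps(10_000, 5_000_000):
        # Pass each string as the table argument of spark.read.jdbc and
        # note the first cap at which the job stops completing.
        print(bounded_table("SCOTT.ORDERS", cap))
```

The cap at which the job first hangs gives a rough idea of which resource to
look at next.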

I assume you have run the job on a small amount of data, so you have completed
your prototype stage successfully.


On Saturday, 11 April 2020 Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
Hi,
Have you checked your JDBC connections from Spark to Oracle? What is Oracle
saying? Is it doing anything or hanging?
set pagesize 9999
set linesize 140
set heading off
select SUBSTR(name,1,8) || ' sessions as on '||TO_CHAR(CURRENT_DATE, 'MON DD YYYY HH:MI AM') from v$database;
set heading on
column spid heading "OS PID" format a6
column process format a13 heading "Client ProcID"
column username  format a15
column sid       format 999
column serial#   format 99999
column STATUS    format a3 HEADING 'ACT'
column last      format 9,999.99
column TotGets   format 999,999,999,999 HEADING 'Logical I/O'
column phyRds    format 999,999,999 HEADING 'Physical I/O'
column total_memory format 999,999,999 HEADING 'MEM/KB'
--
SELECT
          substr(a.username,1,15) "LOGIN"
        , substr(a.sid,1,5) || ','||substr(a.serial#,1,5) AS "SID/serial#"
        , TO_CHAR(a.logon_time, 'DD/MM HH:MI') "LOGGED IN SINCE"
        , substr(a.machine,1,10) HOST
        , substr(p.username,1,8)||'/'||substr(p.spid,1,5) "OS PID"
        , substr(a.osuser,1,8)||'/'||substr(a.process,1,5) "Client PID"
        , substr(a.program,1,15) PROGRAM
        --,ROUND((CURRENT_DATE-a.logon_time)*24) AS "Logged/Hours"
        , (
                select round(sum(ss.value)/1024) from v$sesstat ss, v$statname sn
                where ss.sid = a.sid and
                        sn.statistic# = ss.statistic# and
                        -- sn.name in ('session pga memory')
                        sn.name in ('session pga memory','session uga memory')
          ) AS total_memory
        , (b.block_gets + b.consistent_gets) TotGets
        , b.physical_reads phyRds
        , decode(a.status, 'ACTIVE', 'Y','INACTIVE', 'N') STATUS
        , CASE WHEN a.sid in (select sid from v$mystat where rownum = 1) THEN '<-- YOU' ELSE ' ' END "INFO"
FROM
         v$process p
        ,v$session a
        ,v$sess_io b
WHERE
a.paddr = p.addr
AND p.background IS NULL
--AND  a.sid NOT IN (select sid from v$mystat where rownum = 1)
AND a.sid = b.sid
AND a.username is not null
--AND (a.last_call_et < 3600 or a.status = 'ACTIVE')
--AND CURRENT_DATE - logon_time > 0
--AND a.sid NOT IN ( select sid from v$mystat where rownum=1)  -- exclude me
--AND (b.block_gets + b.consistent_gets) > 0
ORDER BY a.username;
exit

HTH


Dr Mich Talebzadeh

 

LinkedIn  
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

 

http://talebzadehmich.wordpress.com




Disclaimer: Use it at your own risk. Any and all responsibility for any loss,
damage or destruction of data or any other property which may arise from relying
on this email's technical content is explicitly disclaimed. The author will in
no case be liable for any monetary damages arising from such loss, damage or
destruction.

 


On Fri, 10 Apr 2020 at 17:37, Ruijing Li <liruijin...@gmail.com> wrote:

Hi all,
I am on Spark 2.4.4 with Scala 2.11.12, running in cluster mode on Mesos.
I am ingesting from an Oracle database using spark.read.jdbc. I am seeing a
strange issue where Spark just hangs and does nothing, not starting any new
tasks. Normally this job finishes in 30 stages, but sometimes it stops at 29
completed stages and doesn’t start the last stage. The Spark job is idling and
there are no pending or active tasks. What could be the problem? Thanks.
-- 
Cheers,
Ruijing Li
