[ 
https://issues.apache.org/jira/browse/AIRFLOW-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710075#comment-16710075
 ] 

jack commented on AIRFLOW-3058:
-------------------------------

Yep. Added -u flag to the run of the script solve this issue.

I think it's worth mentioning somewhere in the docs though... the problem shown 
here isn't the delay of the print but the fact that the time stamp is incorrect.

The timestamp is of the actual print to the log and not of the time it was 
entered to the buffer. This is highly confusing. 

> Airflow log & multi-threading
> -----------------------------
>
>                 Key: AIRFLOW-3058
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3058
>             Project: Apache Airflow
>          Issue Type: Task
>            Reporter: jack
>            Priority: Major
>         Attachments: 456.PNG, Sni.PNG
>
>
> The airflow log does not show messages in real time when executing scripts 
> with Multi-threading.
>  
> for example:
>  
> The left is the Airflow log time. the right is the actual time of the print 
> in my code. If I would execute the script without airflow the console will 
> show the times on the right.
> !Sni.PNG!
> {code:java}
> 2018-09-13 14:19:17,325] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 
> 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 14:14:55.230044 
> Thread: Thread-1 Generate page: #0 run #0 with URL: 
> http://...&cultureid=2&offset=0&limit=1000
> [2018-09-13 14:19:17,325] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 
> 14:14:55.231635 Thread: Thread-2 Generate page: #1 run #0 with URL: 
> http://...&cultureid=2&offset=1000&limit=1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 
> 14:14:55.233226 Thread: Thread-3 Generate page: #2 run #0 with URL: 
> http://...&cultureid=2&offset=2000&limit=1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 
> 14:14:55.234020 Thread: Thread-4 Generate page: #3 run #0 with URL: 
> http://...&cultureid=2&offset=3000&limit=1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:15:43.100122 Thread: Thread-1 page 0 finished. Length is 1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:15:43.100877 Thread: Thread-1 Generate page: #4 run #0 with URL: 
> http://...&cultureid=2&offset=4000&limit=1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:15:46.254536 Thread: Thread-3 page 2 finished. Length is 1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:15:46.255508 Thread: Thread-3 Generate page: #5 run #0 with URL: 
> http://...&cultureid=2&offset=5000&limit=1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:15:51.096360 Thread: Thread-2 page 1 finished. Length is 1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:15:51.097269 Thread: Thread-2 Generate page: #6 run #0 with URL: 
> http://...&cultureid=2&offset=6000&limit=1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:15:53.112621 Thread: Thread-4 page 3 finished. Length is 1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:15:53.113455 Thread: Thread-4 Generate page: #7 run #0 with URL: 
> http://...&cultureid=2&offset=7000&limit=1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:16:37.345343 Thread: Thread-3 Generate page: #8 run #0 with URL: 
> http://...&cultureid=2&offset=8000&limit=1000
> [2018-09-13 14:19:17,328] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 
> 14:16:37.701201 Thread: Thread-2 Generate page: #9 run #0 with URL: 
> http://...&cultureid=2&offset=9000&limit=1000
> [2018-09-13 14:19:17,328] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,291] {bash_operator.py:101} INFO - 2018-09-13 
> 14:16:47.283796 Thread: Thread-1 page 4 finished. Length is 1000
> [2018-09-13 14:19:17,328] {base_task_runner.py:98} INFO - Subtask: 
> [2018-09-13 14:19:17,291] {bash_operator.py:101} INFO - 2018-09-13 
> 14:17:27.169359 Thread: Thread-2 page 9 finished. Length is 1000
>  
> {code}
> This never happens when executing regular code.. Happens only with 
> multi-threading. I have some other scripts that the airflow print appears 
> after more than 30 minutes.
>  
>  Check this one:
> hours of delay and then printing everything together. These are not real 
> time. the prints in the log has no correlation to the actual time the command 
> was executed.
>  
> !456.PNG!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to