Apache Spark Streaming : Logging Driver Executor logs the right way
From my experience, i feel logging properly is one of the most important thing to do first when starting Spark Streaming development especially when you are running on cluster with multiple worker machines.
Reason is simple : Streaming is a continuous running process and the exception/error may arrive after many hours/days and it can be because of driver or can be because of executor. It will be hard to debug the root cause as driver logs are coming in console cannot be seen after application shuts down while executor logs come in std out/err files ( i am using Mesos as cluster manager) which is tedious to download and see. So when some issue comes, like in my case an out-of-memory issue came after 2 days of running and application went down. I had to be sure whether driver or executor was the actual culprit where issue came first. So i first did logging configuration properly before debugging the issue.
Also with logging, we can control how much retention/days of logs we want to keep for driver and executor so that disk space is not eat up by logs generated by ever running application. And if we are running multiple spark streaming applications on the same cluster , we can enable logging to separate log files for different executors even if multiple executors happen to run on same worker machine.
Logging Configuration Steps :
I am using standard apache logger library for logging with appropriate logging levels in code. Default spark log4j properties template can be found in the spark conf directory. For example in my case, its at /usr/local/spark-1.5.1-bin-hadoop2.6/conf/ : log4j.properties.template
1. Create separate log4j configuration files for driver and executor and place them at conf/ directory. Each of them should be configured with different file name and other rolling properties as per use case. In case of multiple applications, create the same separately and differently for each application . For example, i have 2 streaming application named Request and Event. So, have created 4 files as:
Contents of log4j file should be like :
log4j.rootCategory=INFO, FILE
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.appender.FILE=org.apache.log4j.RollingFileAppender
log4j.appender.FILE.File=/tmp/eventLogDriver.log
log4j.appender.FILE.ImmediateFlush=true
log4j.appender.FILE.Threshold=debug
log4j.appender.FILE.Append=true
log4j.appender.FILE.MaxFileSize=500MB
log4j.appender.FILE.MaxBackupIndex=10
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
2. Copy the above files to similar conf/ directory .........continue reading