Victor Wong created FLINK-15448:
-----------------------------------

Summary: Make "ResourceID#toString" more descriptive
Key: FLINK-15448
URL: https://issues.apache.org/jira/browse/FLINK-15448
Project: Flink
Issue Type: Improvement
Affects Versions: 1.9.1
Reporter: Victor Wong

With Flink on Yarn, we sometimes run into an exception like this:

{code:java}
java.util.concurrent.TimeoutException: The heartbeat of TaskManager with id container_xxxx timed out.
{code}

To find the host of the lost TaskManager so that we can log into it for more details, we have to search the earlier logs for the host information, which is a little time-consuming.

Maybe we can add more descriptive information to the ResourceID of Yarn containers, e.g. "container_xxx@host_name:port_number".

Here's the demo:

{code:java}
class ResourceID {
    final String resourceId;
    // Human-readable description, used only by toString().
    final String details;

    public ResourceID(String resourceId) {
        // Without extra details, fall back to the plain id.
        this.resourceId = resourceId;
        this.details = resourceId;
    }

    public ResourceID(String resourceId, String details) {
        this.resourceId = resourceId;
        this.details = details;
    }

    @Override
    public String toString() {
        return details;
    }
}

// in flink-yarn
private void startTaskExecutorInContainer(Container container) {
    final String containerIdStr = container.getId().toString();
    // e.g. "container_xxx@host_name:port_number"
    final String containerDetail = container.getId() + "@" + container.getNodeId();
    final ResourceID resourceId = new ResourceID(containerIdStr, containerDetail);
    ...
}
{code}
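
One point the demo above leaves open is identity: if the details string took part in equality, the same container described with and without a host would no longer match. The following is a minimal, self-contained sketch of one way to handle that, not Flink's actual ResourceID. The class name `DescriptiveResourceId`, the factory `forContainer`, and the host and port values are all hypothetical; the assumption is that equality and hashing should depend on the id alone, while `toString` returns the descriptive form.

```java
import java.util.Objects;

// Hypothetical stand-in for the proposed change; NOT the real Flink ResourceID class.
final class DescriptiveResourceId {
    private final String resourceId;
    // Used for logging only; deliberately excluded from equals/hashCode.
    private final String details;

    DescriptiveResourceId(String resourceId) {
        this(resourceId, resourceId);
    }

    DescriptiveResourceId(String resourceId, String details) {
        this.resourceId = Objects.requireNonNull(resourceId, "resourceId");
        this.details = Objects.requireNonNull(details, "details");
    }

    // Hypothetical helper that builds the "container_xxx@host_name:port_number" form.
    static DescriptiveResourceId forContainer(String containerId, String host, int port) {
        return new DescriptiveResourceId(containerId, containerId + "@" + host + ":" + port);
    }

    String getResourceIdString() {
        return resourceId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof DescriptiveResourceId)) {
            return false;
        }
        // Identity is the id only, so adding details never changes lookups.
        return resourceId.equals(((DescriptiveResourceId) o).resourceId);
    }

    @Override
    public int hashCode() {
        return resourceId.hashCode();
    }

    @Override
    public String toString() {
        return details;
    }
}
```

With this sketch, `DescriptiveResourceId.forContainer("container_01", "host-a", 8041).toString()` yields `container_01@host-a:8041`, while the object still compares equal to `new DescriptiveResourceId("container_01")`, so maps keyed by the id keep working.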