Hi,

Yes, the binary works correctly on my host. Also, I was able to load it in an Ubuntu 18 FS simulation using gem5.

Before the panic condition is triggered, an exception is thrown, which does not occur when running on the host. I tried to debug gem5 by running it under gdb and using "catch throw" to intercept exceptions. However, gdb only stopped at the panic condition and not when the exception was thrown, which makes me believe that the exception was thrown by libtorch inside the simulated program rather than by gem5 itself.

"Also, as a sidenote: If you’re wanting to simulate PyTorch, don’t you want to simulate a GPU too?"

Maybe in the future. At the moment I'm focused on getting a basic simulation of CPU+libtorch working :)

On 19.07.23 02:55, Bobby Bruce via gem5-users wrote:
I’m afraid I don’t know exactly what’s causing this error, but just to make sure: does the binary you built and passed as a `CustomResource` execute on your host? This looks like an error coming from PyTorch, not the simulator. That being said, I don’t understand why "build/X86/sim/faults.cc:61: panic: panic condition !FullSystem occurred: fault (General-Protection) detected @ PC" is occurring afterwards either; that could also be the issue. Personally, I’m always a bit scared of linking to dynamic libraries on the host as well.

If you want to get around this the annoying advice is to use FS mode. It’s slower, and requires creation of a disk image, but it isn’t nearly as error-prone as SE mode. If your binary works on your host then you should be able to get it to work in FS mode. Using checkpoints and (if you have the right hardware and are using X86) KVM cores can speed things up for you too.

Also, as a sidenote: If you’re wanting to simulate PyTorch, don’t you want to simulate a GPU too?

--
Dr. Bobby R. Bruce
Room 3050,
Kemper Hall, UC Davis
Davis,
CA, 95616

web: https://www.bobbybruce.net

On Jul 14, 2023, at 3:02 AM, Caio Vieira via gem5-users <gem5-users@gem5.org> wrote:


Hi everyone,

I'm trying to execute gem5 simulations using libtorch in SE mode. However, I get the following error message:

--- Error message ---
...
terminate called after throwing an instance of 'std::runtime_error'
  what():  expected eof but found 'ident' here:
aten::quantized_lstm.inpr input, Tensor[] orch.classes.rnn.CellPara[] params, bool has_biases, int num_layers, float dropout, bool train, bool bidirectional, bool batch_first, *, ScalarType? dtype=None, bool use_dynamic=False) dy
namic=False) -> (Tensor, Tensor, Tenso Tensor, Tensor)
                          ~~~~~ <--- HERE

build/X86/sim/syscall_emul.cc:86: warn: ignoring syscall rt_sigprocmask(...)
      (further warnings will be suppressed)
build/X86/sim/syscall_emul.cc:86: warn: ignoring syscall rt_sigaction(...)
      (further warnings will be suppressed)
build/X86/sim/faults.cc:61: panic: panic condition !FullSystem occurred: fault (General-Protection) detected @ PC (0x7fff7a3d5898=>0x7fff7a3d5899).(0=>1)
Memory Usage: 11842716 KBytes
Program aborted at tick 294083905383
--- BEGIN LIBC BACKTRACE ---
...

The simulation fails before the first line of the main function. I believe that it is failing to load the libtorch library. Unfortunately, it is not possible to build libtorch with "-static", since its static builds have been broken for quite a long time: https://github.com/pytorch/pytorch/issues/21737
I've tested with gem5 v22.1.0.0 and also 22.0.0.2. I've also tested using different GCC versions to build the simulated binary.
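
A minimal sketch for listing the shared libraries the simulated binary depends on, assuming the text being parsed comes from running `ldd build/main` on the host. The helper names `parse_ldd`, `missing_libs` and `ldd_output` are hypothetical, and this only shows what the host loader resolves; it does not prove that gem5 can load the same files in SE mode.

```python
import re
import subprocess
from typing import Dict, List, Optional

# Matches lines such as "libc10.so => /path/libc10.so (0x...)" or
# "libc10.so => not found". Lines without "=>" (vdso, the loader) are skipped.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(not found|\S+)")


def parse_ldd(output: str) -> Dict[str, Optional[str]]:
    """Map each library name to its resolved path, or None if unresolved."""
    libs: Dict[str, Optional[str]] = {}
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            name, target = match.groups()
            libs[name] = None if target == "not found" else target
    return libs


def missing_libs(output: str) -> List[str]:
    """Return the sorted names of libraries that ldd could not resolve."""
    return sorted(name for name, path in parse_ldd(output).items() if path is None)


def ldd_output(binary: str) -> str:
    """Run ldd on the host for the given binary and return its stdout."""
    result = subprocess.run(["ldd", binary], capture_output=True, text=True)
    return result.stdout
```

For example, `missing_libs(ldd_output("build/main"))` should return an empty list on a host where the binary runs correctly.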

For anyone interested in reproducing the error, I'm sending a "setup.sh" script that creates a minimal reproducible environment. Simply copy the script below into a file named "setup.sh" in a new directory, then run:

source setup.sh
cmake -B build -S .
cmake --build build
./<gem5> config.py build/main

Best regards,
Caio Vieira

--- setup.sh ---

#!/bin/bash

# Bash script to create minimal reproducible environment for libtorch simulation
# bug. This script creates necessary files such as a CMakeLists.txt and a minimal
# main.cpp. The CMakeLists.txt file downloads and manages libtorch by saving it
# in a "_deps" folder. Steps to reproduce the bug:
# ./<this-script>
# cmake -B build -S .
# cmake --build build
# ./<gem5> config.py build/main

function create_cmake() {
    cat > CMakeLists.txt <<- \EOF
cmake_minimum_required(VERSION 3.22 FATAL_ERROR)

# Download and manage libtorch dependency
set(DEPENDENCY_DIR "${CMAKE_CURRENT_LIST_DIR}/_deps")

file(MAKE_DIRECTORY "${DEPENDENCY_DIR}")
if(NOT EXISTS "${DEPENDENCY_DIR}/libtorch")
    file(DOWNLOAD
        https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-2.0.0%2Bcpu.zip
        "${DEPENDENCY_DIR}/libtorch.zip")
    file(ARCHIVE_EXTRACT
        INPUT "${DEPENDENCY_DIR}/libtorch.zip"
        DESTINATION "${DEPENDENCY_DIR}")
    file(REMOVE "${DEPENDENCY_DIR}/libtorch.zip")
endif()
set(CMAKE_PREFIX_PATH "${DEPENDENCY_DIR}/libtorch")
find_package(Torch REQUIRED)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")

project(main)

# Create executable
add_executable(${CMAKE_PROJECT_NAME} main.cpp)
target_include_directories(${CMAKE_PROJECT_NAME} PUBLIC ${TORCH_INCLUDE_DIRS})
target_link_libraries(${CMAKE_PROJECT_NAME} ${TORCH_LIBRARIES})
set_property(TARGET ${CMAKE_PROJECT_NAME} PROPERTY CXX_STANDARD 14)
message("Torch Libraries: ${TORCH_LIBRARIES}")
EOF
}

function create_main() {
    cat > main.cpp <<- \EOF
#include <iostream>
#include <torch/torch.h>

int main(int argc, char *argv[]) {
    std::cout << "Hello World" << std::endl;
}
EOF
}

function create_gem5_config() {
    cat > config.py <<- \EOF
# This script must be executed by gem5
# Usage: ./<gem5> <this-script> <binary>

import sys

from gem5.utils.requires import ISA
from gem5.components.boards.simple_board import SimpleBoard
from gem5.components.cachehierarchies.classic.no_cache import NoCache
from gem5.components.memory.single_channel import SingleChannelDDR3_1600
from gem5.components.processors.simple_processor import SimpleProcessor
from gem5.components.processors.cpu_types import CPUTypes
from gem5.resources.resource import CustomResource
from gem5.simulate.simulator import Simulator

if len(sys.argv) == 1:
    print("Provide a binary as argument", file=sys.stderr)
    sys.exit(1)

cache_hierarchy = NoCache()
memory = SingleChannelDDR3_1600("8GiB")
processor = SimpleProcessor(cpu_type=CPUTypes.ATOMIC, num_cores=1, isa=ISA.X86)

board = SimpleBoard(
        clk_freq="3GHz",
        processor=processor,
        memory=memory,
        cache_hierarchy=cache_hierarchy
)

binary = CustomResource(sys.argv[1])
board.set_se_binary_workload(binary)

simulator = Simulator(board=board)
simulator.run()
EOF
}

create_cmake
create_main
create_gem5_config
_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org

