No variable explorer or call stack when debugging

Hello,

I’m getting some unexpected behavior when debugging with gdb and the Modern Fortran plugin. Admittedly, I’m a very new gfortran/gcc/gdb user, so it’s possible that I’m just using something incorrectly. Still, any guidance would be very much appreciated.

I’m trying to debug the pw.x program in the Quantum ESPRESSO suite. The pwscf.f90 file, which is the main entry point for pw.x, lies in the /PW/src directory of the repo. But the build process links pw.x with many libraries built from files lying outside /PW/src, such as those in /Modules.

When I set a breakpoint in a file lying within /PW/src, everything works as expected: when the breakpoint is hit, VSCode automatically opens the file, highlights the breakpoint line in yellow, and displays the variables and call stack in the debug panel. Additionally, in the debug console, I can see a message from gdb informing me that the breakpoint has been hit.

However, when I set a breakpoint in a file lying in a directory other than /PW/src (such as /Modules), I get unexpected behavior. VSCode doesn’t open the file or highlight the breakpoint line in yellow, and neither the correct variables nor the correct call stack is displayed in the debug panel. I can still see a message in the debug console informing me that the breakpoint has been hit.

Since I see in the debug console that the breakpoint is being hit, I’m assuming that the issue isn’t with the build process or with gdb, and instead is with the Fortran VSCode plugin. Or perhaps I’ve set something up incorrectly; that’s a very real possibility.

Any idea of what is going wrong here? Any help is much appreciated.

Given that the debugger spins up correctly, I suspect that your VS Code setup is valid. Feel free to post the contents of launch.json to make sure.

mp_bcast() is an MPI call, and unfortunately GDB does not do MPI debugging in a useful way. You can spin up a GDB instance for every MPI rank, which is not very helpful; some examples of how to do that in VS Code (we use the C++ debuggers behind the scenes) can be found here:

There are good parallel debuggers like TotalView, which make things a lot simpler, but you need to buy a license key.

It looks like that is the first time an MPI function is called since starting the program.
I think it is possible that you’ve launched that case without an mpiexec command, and hence either have an invalid MPI communicator ID or haven’t called MPI_INIT at all.

Thank you for your replies.

@gnikit, thanks for those links, they were very informative.
However (as I should have specified when I made my original post) I am building serial executables, and am running pw.x directly, without mpirun, so I don’t need to do any MPI debugging.
Here is the relevant configuration from my launch.json file:

{
    "name": "(gdb) Debug pw.x with input file",
    "type": "cppdbg",
    "request": "launch",
    "program": "${workspaceFolder}/bin/pw.x",
    "args": ["-inp", "${workspaceFolder}/telzrow_test_files_for_pycdft/espresso.pwi"], // Possible input args for "program"
    "stopAtEntry": false,
    "cwd": "${workspaceFolder}",
    "environment": [],
    "externalConsole": false,
    "MIMode": "gdb",
    "setupCommands": [
        {
            "description": "Enable pretty-printing for gdb",
            "text": "-enable-pretty-printing",
            "ignoreFailures": true
        }
    ],
    "preLaunchTask": "Build pw.x"
}

The Build pw.x task is defined in my tasks.json file as follows:

{
    "label": "Build pw.x",
    "type": "shell",
    "command": "make pw",
    "presentation": {
        "reveal": "always",
        "panel": "new"
    }
}

@FedericoPerini, thanks for your reply.
I apologize for not specifying this in my original post, but Quantum ESPRESSO uses wrappers around MPI functions, so that serial executables can be built.
So, even though I’m launching the program directly without using the mpiexec command, I don’t believe this should be a problem.

Again, thank you both for your feedback.
Since my original post, I’ve spent quite a bit of time investigating this issue and I believe I’m closer to a solution:

The issue was actually appearing as soon as the open_input_file function was called, which occurs right before that mp_bcast call in the screenshot I posted originally.
I can set a breakpoint at any point in execution up to or including that open_input_file line.
But if I set a breakpoint at any line executed afterwards, even at the very first line of that open_input_file function, the issue appears.
I tried to debug the program directly, rather than using VSCode.
I set a breakpoint at line 48 of the read_input.f90 file, which is shown in the screenshot of my original post.
I also set a breakpoint at line 93 of open_close_input_file.f90, which is the very first line of that open_input_file function.

As you can see in the output below, the program stops at the first breakpoint, and I’m able to run the bt gdb command.
The program continues to the next breakpoint, and I’m able to run the bt gdb command again.
(When debugging with VSCode, this wouldn’t be possible: gdb would have already hung/crashed at this point.)
However, as you can see, when I run the info variables command, gdb crashes with a segfault:

gdb bin/pw.x
GNU gdb (Debian 13.1-3) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./bin/pw.x...
(gdb) break Modules/read_input.f90:48
Breakpoint 1 at 0x46f514: file read_input.f90, line 48.
(gdb) break Modules/open_close_input_file.f90:93
Breakpoint 2 at 0x5357b7: file open_close_input_file.f90, line 93.
(gdb) run
Starting program: /home/jamestelzrow/q-e/bin/pw.x 
BFD: error: /usr/lib/debug/.build-id/b5/94dc721d75112eb9f2aa7a2c0ae957f373d962.debug(.debug_info) is too large (0x15ef54 bytes)
warning: Can't read data for section '.debug_info' in file '/usr/lib/debug/.build-id/b5/94dc721d75112eb9f2aa7a2c0ae957f373d962.debug'
warning: Section .debug_aranges in /usr/lib/debug/.build-id/b5/94dc721d75112eb9f2aa7a2c0ae957f373d962.debug entry at offset 0 debug_info_offset 0 does not exists, ignoring .debug_aranges.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

     Program PWSCF v.7.2 starts on  3Nov2023 at 22:12:29 

     This program is part of the open-source Quantum ESPRESSO suite
     for quantum simulation of materials; please cite
         "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
         "P. Giannozzi et al., J. Phys.:Condens. Matter 29 465901 (2017);
         "P. Giannozzi et al., J. Chem. Phys. 152 154105 (2020);
          URL http://www.quantum-espresso.org", 
     in publications or presentations arising from this work. More details at
     http://www.quantum-espresso.org/quote

     Serial version
     29762 MiB available memory on the printing compute node when the environment starts


Breakpoint 1, read_input::read_input_file (prog=..., input_file_=..., _prog=_prog@entry=2, _input_file_=_input_file_@entry=256) at read_input.f90:48
48           IF ( ionode ) ierr = open_input_file( input_file_, xmlinput )
(gdb) bt
#0  read_input::read_input_file (prog=..., input_file_=..., _prog=_prog@entry=2, _input_file_=_input_file_@entry=256) at read_input.f90:48
#1  0x0000555555565c4b in pwscf () at pwscf.f90:84
(gdb) c
Continuing.

Breakpoint 2, open_close_input_file::open_input_file (input_file_=..., is_xml=.TRUE., _input_file_=_input_file_@entry=256) at open_close_input_file.f90:93
93        IF ( PRESENT(input_file_) ) THEN
(gdb) bt
#0  open_close_input_file::open_input_file (input_file_=..., is_xml=.TRUE., _input_file_=_input_file_@entry=256) at open_close_input_file.f90:93
#1  0x00005555559c361e in read_input::read_input_file (prog=..., input_file_=..., _prog=_prog@entry=2, _input_file_=_input_file_@entry=256) at read_input.f90:48
#2  0x0000555555565c4b in pwscf () at pwscf.f90:84
(gdb) info variables
All defined variables:

File ../csu/abi-note.c:
71:     static const struct {
    Elf64_Nhdr nhdr;
    char name[4];
    int32_t desc[4];
} __abi_tag;

File ../dlfcn/dlerror.h:
83:     static struct dl_action_result * const dl_action_result_malloc_failed;

File ../login/utmp_file.c:
37:     static int file_fd;
39:     static off64_t file_offset;
38:     static _Bool file_writable;
42:     static struct utmp last_entry;

File ../nptl_db/db_info.c:
111:    const uint32_t _thread_db_const_thread_area;

File ../nptl_db/structs.def:
82:     const uint32_t _thread_db___nptl_initial_report_events[3];
80:     const uint32_t _thread_db___nptl_nthreads[3];
84:     const uint32_t _thread_db___pthread_keys[3];
98:     const uint32_t _thread_db_dtv_dtv[3];
116:    const uint32_t _thread_db_dtv_slotinfo_list_slotinfo[3];
95:     const uint32_t _thread_db_link_map_l_tls_modid[3];
96:     const uint32_t _thread_db_link_map_l_tls_offset[3];
66:     const uint32_t _thread_db_list_t_next[3];
67:     const uint32_t _thread_db_list_t_prev[3];
56:     const uint32_t _thread_db_pthread_cancelhandling[3];
60:     const uint32_t _thread_db_pthread_eventbuf[3];
61:     const uint32_t _thread_db_pthread_eventbuf_eventmask[3];
62:     const uint32_t _thread_db_pthread_eventbuf_eventmask_event_bits[3];
93:     const uint32_t _thread_db_pthread_key_data_level2_data[3];
52:     const uint32_t _thread_db_pthread_list[3];
63:     const uint32_t _thread_db_pthread_nextevent[3];
53:     const uint32_t _thread_db_pthread_report_events[3];
58:     const uint32_t _thread_db_pthread_schedparam_sched_priority[3];
57:     const uint32_t _thread_db_pthread_schedpolicy[3];
--Type <RET> for more, q to quit, c to continue without paging--c

(I'm omitting this section of the output because it is a very long list of variables that I don't believe is relevant)

File qexsd.f90:
        integer(kind=8) _F.qexsd_module_MOD_clock_list;


Fatal signal: Segmentation fault
----- Backtrace -----
0x5645c7b4e40e ???
0x5645c7c57601 ???
0x5645c7c57776 ???
0x7feccb45afcf ???
        ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x7feccb586618 __strlen_evex
        ../sysdeps/x86_64/multiarch/strlen-evex.S:79
0x5645c7ba7083 ???
0x5645c7c6c62e ???
0x5645c7c6ca74 ???
0x5645c7ec2341 ???
0x5645c7e54df2 ???
0x5645c7e57795 ???
0x5645c7e5c2b3 ???
0x5645c7e5c440 ???
0x5645c7b80c94 ???
0x5645c7e8e287 ???
0x5645c7c57e1c ???
0x5645c7c593cf ???
0x5645c7c586d1 ???
0x7feccc62246c ???
0x5645c7c587fd ???
0x5645c7c5898f ???
0x5645c7c57d0c ???
0x5645c803f1d5 ???
0x5645c803fcb2 ???
0x5645c7d212f9 ???
0x5645c7d22f74 ???
0x5645c7ab1ca9 ???
0x7feccb4461c9 __libc_start_call_main
        ../sysdeps/nptl/libc_start_call_main.h:58
0x7feccb446284 __libc_start_main_impl
        ../csu/libc-start.c:360
0x5645c7ab8e30 ???
0xffffffffffffffff ???
---------------------
A fatal error internal to GDB has been detected, further
debugging is not possible.  GDB will now terminate.

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

Segmentation fault

I’m going to report this crash to the gdb developers.
But am I correct in assuming that either the C/C++ or Modern Fortran plugin probably runs some similar crash-causing gdb command, and then silently crashes without displaying a warning to the user?

Yes, that is correct. If you have a look at launch.json you will see references to GDB; there are even options to point to other GDB binaries and pass custom arguments if one so wishes. We inherit all of this infrastructure from the C/C++ extension.
As for silently crashing, maybe. There is a debug window in VS Code; the fault might be reported there.
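
To make the launch.json options mentioned above a bit more concrete, here is a rough sketch of what such a configuration could look like. It is only an illustration, not a known fix for this crash: the gdb path is a made-up placeholder, and "-nx" is just one example of an argument you might pass. The "miDebuggerPath", "miDebuggerArgs" and "logging" keys come from the C/C++ (cppdbg) debugger; check its documentation for the exact behaviour in your version.

```json
{
    "name": "(gdb) Debug pw.x with another gdb build",
    "type": "cppdbg",
    "request": "launch",
    "program": "${workspaceFolder}/bin/pw.x",
    "cwd": "${workspaceFolder}",
    "MIMode": "gdb",
    // ASSUMPTION: placeholder path, replace with the gdb binary you want to try
    "miDebuggerPath": "/usr/local/bin/gdb",
    // Example only: -nx tells gdb not to read any .gdbinit files
    "miDebuggerArgs": "-nx",
    // Ask the debug engine to log its diagnostic messages to the Debug Console
    "logging": {
        "engineLogging": true
    }
}
```

With engine logging turned on, the Debug Console should show more of what the extension sends to gdb, which may help to spot the command that is running when gdb goes down.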