Issue with RMCProfile V6.7.9 Linux GPU Request for Assistance
Forum: Discuss Powder (https://powder.ornl.gov/forum) > Reduce & Collect & Analyze Data > Data Analysis > RMCProfile (https://powder.ornl.gov/forum/forumdisplay.php?fid=22)
Thread: /showthread.php?tid=85

Issue with RMCProfile V6.7.9 Linux GPU Request for Assistance - riekasa - 08-06-2024

I am writing to inform you about an issue I have encountered with the RMCProfile V6.7.9 Linux GPU that I downloaded from https://sourceforge.net/projects/rmcprofile/files/Version_6.7.9/RMCProfile_V6.7.9_Linux_64_GPU.tgz/download. Unfortunately, the program stops functioning and displays an error message at the end of the execution log. Below is a snippet of the execution log:

Use CUDA version
----------------------------------------------------------------------------
RMCProfile version 6.7.9.5
==========================
Version type: Version with molecular constraints and EXAFS
Version date: Aug-22-2023::01
Release notes: Robust XRAY implementation; auto weights adjustment; topas profile for Bragg
Using files: ABCD
----------------------------------------------------------------------------
THREADS_numbr = 24
THREADS_id = 0
THREADS_id = 5
THREADS_id = 21
…
Info> Boltzmann temperature is 2.47771265600000
No atom swap moves will be generated, atoms will only be translated.
Since there is no ZMOVESCALE line ZMOVESCALE=1.0
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image       PC                Routine            Line     Source
rmcprofile  000000000073D743  Unknown            Unknown  Unknown
libc.so.6   00007F1ED3A42520  Unknown            Unknown  Unknown
rmcprofile  00000000005DF8E2  Unknown            Unknown  Unknown
rmcprofile  000000000045E884  Unknown            Unknown  Unknown
rmcprofile  0000000000404FE2  Unknown            Unknown  Unknown
libc.so.6   00007F1ED3A29D90  Unknown            Unknown  Unknown
libc.so.6   00007F1ED3A29E40  __libc_start_main  Unknown  Unknown
rmcprofile  0000000000404EEE  Unknown            Unknown  Unknown

As you can see, the error message appears at the end of the log. I would greatly appreciate it if you could provide guidance or suggest any potential solutions to resolve this issue.

Here are some additional details that might be helpful:

Operating System: Ubuntu 22.04.4 LTS

Thank you very much for your time and assistance. I look forward to your response.

Best regards,

RE: Issue with RMCProfile V6.7.9 Linux GPU Request for Assistance - y8z-admin - 08-11-2024

Can you share the whole RMC folder so we can help debug? Thank you!

RE: Issue with RMCProfile V6.7.9 Linux GPU Request for Assistance - riekasa - 08-19-2024

Thank you for your prompt response and willingness to assist. As requested, I have prepared the entire RMC folder for your review. Please find it attached to this email. I hope this will help you in debugging the issue. Thank you once again for your support.

Best regards,

RE: Issue with RMCProfile V6.7.9 Linux GPU Request for Assistance - y8z-admin - 09-08-2024

Having checked through your files, I think the issue probably has nothing to do with the GPU accelerator. Your input data does not appear to sit on an equally spaced grid, which suggests that you should rebin your data before feeding it into the RMCProfile fitting. Detailed instructions are here: https://rmcprofile.ornl.gov/data-pre-processing-for-rmcprofile

Also, when you use 'FITTED_SCALE', you may want to enable 'FITTED_OFFSET' as well. Its absence may not be causing a problem, but it is good to have. You may also want to check out the weight optimization described in the RMCProfile package manual, https://rmcprofile.ornl.gov/manual/

The 'SAVE_PERIOD' is probably too short. I would set it to something like '10 MINUTES'.

I am not sure whether you have already gone through the tutorials that come with the package. If not, I would suggest going through them first.

RE: Issue with RMCProfile V6.7.9 Linux GPU Request for Assistance - riekasa - 09-12-2024

Thank you very much for your detailed feedback and guidance, and for taking the time to thoroughly review my input files.
I have followed your suggestions and made the following changes:

1. Rebinning of the input data has been performed.
2. Enabled 'FITTED_OFFSET' alongside 'FITTED_SCALE'.
3. Adjusted the 'SAVE_PERIOD' to '10 MINUTES'.

However, despite these changes, I am still encountering the same error, and the calculation stops. I have not yet implemented the weight optimization, as I believe it may not be directly related to the issue we are currently facing.

Additionally, I would like to mention that when using the CPU version of RMCProfile (version 6.7.9.4) with the same input files, excluding the GPU options, the simulation runs without any errors. This issue seems to be specific to the GPU version.

For your information, I have gone through the tutorials and the manual, though only briefly, to familiarize myself with the package. I have attached the current input file and log file (nohup.out) for your review. Could you please provide further guidance on how to resolve this issue?

Thank you once again for your support, assistance, and the valuable time you have spent on reviewing my files.

RE: Issue with RMCProfile V6.7.9 Linux GPU Request for Assistance - y8z-admin - 10-22-2024

Sorry for the delay. Attached please find a working version of the setup for your reference. Please let me know if you run into any further issues.

RE: Issue with RMCProfile V6.7.9 Linux GPU Request for Assistance - riekasa - 11-05-2024

Thanks to your guidance and the working version of the setup you provided, I am now able to run the software smoothly on my on-premise computer without any errors.

However, I have encountered a new challenge when trying to run RMCProfile in a virtual machine environment on a cloud platform. Specifically, I encountered the following error message when executing the "ulimit -s unlimited" command in the "submit.sh" file: "./submit.sh: line 10: ulimit: stack size: cannot modify limit: Operation not permitted."
Additionally, the analysis process stops midway, as indicated by the following log excerpt:

-------------- Log excerpt indicating analysis stoppage ---------------------
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line     Source
rmcprofile         000000000073D743  Unknown            Unknown  Unknown
libpthread-2.26.s  00007FB2169CF8E0  Unknown            Unknown  Unknown
rmcprofile         00000000005DF8E2  Unknown            Unknown  Unknown
rmcprofile         000000000045E884  Unknown            Unknown  Unknown
rmcprofile         0000000000404FE2  Unknown            Unknown  Unknown
libc-2.26.so       00007FB21663213A  __libc_start_main  Unknown  Unknown
rmcprofile         0000000000404EEE  Unknown            Unknown  Unknown

I am seeking your guidance on possible solutions for this issue. As I have limited resources on my on-premise computer, I am exploring the option of running RMCProfile in a virtual machine environment on a cloud platform. If you have any recommendations or insights on how to address this issue in a cloud-based virtual machine environment, I would greatly appreciate your guidance.

Once again, I would like to express my sincere appreciation for your outstanding support and assistance.

RE: Issue with RMCProfile V6.7.9 Linux GPU Request for Assistance - y8z-admin - 11-15-2024

Please check whether you can execute `ulimit -s unlimited` directly from the command line. That will tell you whether the command itself is the problem in your environment. The `submit.sh` script is meant for submitting jobs to the `slurm` job manager, and I am not sure how you were using it on the virtual machine. If you can share more details, such as which OS you were running, the contents of your version of `submit.sh`, and how you executed the RMC fitting with the script, I can try to provide more insight.
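
The command-line check suggested above can be sketched as a short shell script. This is a minimal sketch and not part of the RMCProfile package. It relies on two facts about `ulimit`: without `-S` or `-H` it tries to set both the soft and the hard limit, and an unprivileged user cannot raise a hard limit, which would produce the "Operation not permitted" message quoted earlier. The fallback of raising only the soft limit up to the hard limit is an assumption about what may help; whether the resulting stack size is enough for RMCProfile is not confirmed by this thread.

```shell
#!/usr/bin/env bash
# Report the soft and hard stack-size limits of the current shell
# (a number in KiB, or the word "unlimited").
soft=$(ulimit -S -s)
hard=$(ulimit -H -s)
echo "soft stack limit: ${soft}"
echo "hard stack limit: ${hard}"

# Try the command from submit.sh inside a subshell, so that the
# attempt itself does not change the limits of the current shell.
if (ulimit -s unlimited) 2>/dev/null; then
    echo "ulimit -s unlimited: permitted"
else
    # Assumed fallback: raise only the soft limit, up to the hard limit,
    # which an unprivileged user is always allowed to do.
    echo "ulimit -s unlimited: not permitted; raising soft limit to ${hard}"
    ulimit -S -s "${hard}"
fi
```

If the script reports a finite hard limit, the limit is imposed by the virtual machine or its login configuration, and lifting it would need administrator access on that machine rather than a change to `submit.sh`.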