The OASIS Coupler Forum


load balancing problem


Posted by Anonymous at December 20 2023

I have activated the load balancing option by setting the third number of NLOGPRT to 1. But when I run the coupled WRF-CROCO-WWIII model, it warns that oasis_debug (the first number of NLOGPRT) must be 0.

When I changed it to zero, NLOGPRT = (0 1 1), the model ran normally, but no timeline_xxx_component.nc files were produced.

Has anybody met the same issue?

Posted by Anonymous at December 20 2023

Hi,
I am not sure why there is this warning. Can you please cut and paste the exact message you get?
I ran with 1  0  1 and everything is fine, the timeline files are produced. Maybe you should try that.
  Sophie

Posted by Anonymous at December 21 2023

Hi Sophie

Thank you for your reply.
Here is the warning message I get when I set NLOGPRT = (1 1 1):


(oasis_init_comp) WARNING: WARNING: With load balance analysis
OASIS_debug should be 0

(oasis_init_comp) WARNING: WARNING: With load balance analysis
OASIS_debug should be 0

(oasis_init_comp) WARNING: WARNING: With load balance analysis
OASIS_debug should be 0

(oasis_init_comp) WARNING: WARNING: With load balance analysis
OASIS_debug should be 0

(oasis_init_comp) WARNING: WARNING: With load balance analysis
OASIS_debug should be 0


I will try your settings.
Best,
Han
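
For reference, the combination the warning complains about can be checked before launching a run. The sketch below is only an illustration and not part of OASIS. It assumes the values sit on the first non-empty, non-comment line after the $NLOGPRT keyword in the namcouple, as in the snippets quoted in this thread, and it assumes, from the warning text alone, that a non-zero third number requires the first number to be 0. The names parse_nlogprt and load_balance_conflict are hypothetical.

```python
def parse_nlogprt(namcouple_text):
    """Return the integers found on the first data line after $NLOGPRT.

    Assumption: blank lines and lines starting with '#' are skipped.
    """
    lines = namcouple_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().upper().startswith("$NLOGPRT"):
            for candidate in lines[i + 1:]:
                stripped = candidate.strip()
                if not stripped or stripped.startswith("#"):
                    continue
                return [int(token) for token in stripped.split()]
    raise ValueError("no $NLOGPRT values found")


def load_balance_conflict(values):
    """True when the third value (load balancing) is non-zero while the
    first value (debug level) is also non-zero.

    Assumption: this mirrors the 'OASIS_debug should be 0' warning only.
    """
    debug_level = values[0]
    load_balance = values[2] if len(values) > 2 else 0
    return load_balance != 0 and debug_level != 0


if __name__ == "__main__":
    for text in ("$NLOGPRT\n 1 1 1\n", "$NLOGPRT\n 0 1 1\n"):
        values = parse_nlogprt(text)
        print(values, "conflict" if load_balance_conflict(values) else "ok")
```

Under these assumptions, both (1 1 1) and (1 0 1) are flagged, while (0 1 1) is not.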

Posted by Anonymous at December 21 2023

By the way, Sophie,

I have another silly question. When you use (1 0 1) and successfully get timeline files, are these files produced during the simulation at each coupling time step, or are they generated when the simulation has finished?

Han

Posted by Anonymous at December 21 2023

Hi Sophie,

I have tried your 1 0 1 setting, but I still got the warning message. Also, my coupled model does not stop automatically when it finishes the simulation.

That is to say, the model keeps running indefinitely, even though the log file shows that the simulation period has already finished. And I didn't get any timeline files.

Could you please provide more detailed suggestions?

Han

Posted by Anonymous at December 22 2023

Hi Han,
I am not sure why your model does not stop when it finishes the simulation. Are you sure this has to do with the numbers you put below NLOGPRT?
As I can't reproduce your problem, I am not sure how to try to solve it. If you could reproduce it with "toy" models (i.e. programs that are not real models but that are really coupled, like the programs that you can find in /examples/tutorial_communication) then I could try it myself and hopefully understand what is happening. Let me know if you can do this ...
  With best regards and season's greetings,
  Sophie

Posted by Anonymous at January 2 2024

Hi Sophie,

I will try to build a toy model.

For my real model, I am using OASIS3-MCT 5.0. We have found that when the "load balancing" option is deactivated, the model finishes and quits normally. When it is activated, the model cannot quit normally after the simulation is done, and it is "wrf.exe" that fails to quit.

So, we suspect there is some conflict between our WRF and OASIS.

Han

Posted by Anonymous at January 2 2024

Which version of WRF are you using? Is there a job scheduler?
Can you set the first number to 1 and the second to -1, then post debug.root.xx so we can see what happened?

Posted by Anonymous at January 2 2024

Can you check with
--------------
$NLOGPRT
 30 -1 1
--------------
30 is for full debug information.

Check that there are no issues or warnings.

 ---- ENTER (oasis_mpi_chkerr)
 ---- EXIT  (oasis_mpi_chkerr)
 -- EXIT  (oasis_mpi_bcastl1)
 (oasis_unitget)        9999
 starting wrf task            3  of            8
 starting wrf task            6  of            8
 starting wrf task            2  of            8
 starting wrf task            1  of            8
 starting wrf task            4  of            8
 starting wrf task            7  of            8
 starting wrf task            0  of            8
 starting wrf task            5  of            8
 CPL-CROCO: sent CROCO_SST           1
 CPL-CROCO: sent CROCO_EOCE           2
 CPL-CROCO: sent CROCO_NOCE           3
 CPL-CROCO: received CROCO_EVPR           4
 CPL-CROCO: received CROCO_SRFL           5
 CPL-CROCO: received CROCO_STFL           6
 CPL-CROCO: received CROCO_TAUM           7
 CPL-CROCO: received CROCO_ETAU           8
 CPL-CROCO: received CROCO_NTAU           9

Please post your debug.01.000000 and nout.000000 files.
Best
Subhadeep Maishal
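
When gathering those files, a quick scan for WARNING lines can help decide what to post. This is a minimal sketch under assumptions: the glob patterns are guessed from the file names mentioned in this thread (debug.root.xx, debug.01.000000, nout.000000), and collect_warnings is a hypothetical helper, not an OASIS tool.

```python
import tempfile
from pathlib import Path


def collect_warnings(run_dir, patterns=("debug.root.*", "debug.??.*", "nout.*")):
    """Map each matching log file name to its lines containing 'WARNING'.

    Assumption: the patterns are inferred from file names quoted in the
    thread and may need adjusting for a given setup.
    """
    hits = {}
    for pattern in patterns:
        for path in sorted(Path(run_dir).glob(pattern)):
            if not path.is_file():
                continue
            text = path.read_text(errors="replace")
            lines = [ln.strip() for ln in text.splitlines() if "WARNING" in ln]
            if lines:
                hits[path.name] = lines
    return hits


if __name__ == "__main__":
    # Self-contained demo on a temporary directory with one fake log file.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "debug.01.000000"
        sample.write_text(
            "(oasis_init_comp) WARNING: WARNING: With load balance analysis\n"
            "OASIS_debug should be 0\n"
        )
        for name, lines in collect_warnings(tmp).items():
            print(name, len(lines), "warning line(s)")
```

To scan a real run, call collect_warnings on the run directory instead of the temporary one.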