The OASIS Coupler Forum

Problem running tutorial coupled model

Posted by Anonymous at November 23 2016

Hi,

I work at UPM, and we are planning to couple the NEMO ocean model with the PROMES atmosphere model. I have just downloaded and compiled OASIS3-MCT, and I have also compiled the tutorial model. I was able to run the tutorial model, and it apparently worked fine.

Now I am trying to run the tutorial coupled model. Following the instructions in the document, I modified model1.F90 and model2.F90 and generated the namcouple. However, when I run the models coupled through OASIS, I get an error and OASIS ends the execution with an "oasis_abort".

In the "debug.01.000000" file I get this:

(oasis_coupler_setup) ERROR: namcouple variable not used: FSENDATM
(oasis_coupler_setup) ERROR: namcouple variable not used: FSENDOCN
(oasis_abort) ABORT: on model = model1
(oasis_abort) ABORT: on global rank = 0
(oasis_abort) ABORT: on local rank = 0
(oasis_abort) ABORT: CALLING ABORT FROM OASIS LAYER NOW

This suggests a problem in the "namcouple" configuration, but I checked that all the variables are defined properly, and I get the same error when using the namcouple from the "data_oasis" directory. I have also verified that the error appears when "oasis_enddef" is called.

Do you know what is going wrong?

Thanks in advance. Regards Pedro

Posted by Anonymous at November 25 2016

Hi Pedro,

If you get this error message, it means that the variables are defined in the namcouple but are not declared correctly in your programs with oasis_def_var. The tutorial solution is in model1.F90_TP and model2.F90_TP, if you want to compare it with what you did.
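A rough way to do that comparison from the command line is sketched below. It pulls the field names out of the "namcouple variable not used" lines of the debug file and lists which source files mention each name. The function names are made up for illustration, and a text match only shows that the name appears in a file; it does not prove that oasis_def_var is called correctly with it.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of OASIS.

# Print the field names OASIS reported as unused, one per line.
# Relies on the message text "namcouple variable not used: NAME".
unused_fields() {
    sed -n 's/.*namcouple variable not used: *\([A-Za-z0-9_]*\).*/\1/p' "$1" | sort -u
}

# For each reported field, list the given source files that mention it.
# Usage: report_fields DEBUG_FILE SOURCE_FILE...
report_fields() {
    debug_file=$1
    shift
    for field in $(unused_fields "$debug_file"); do
        # -l prints only file names, -F treats the field name as plain text
        hits=$(grep -lF -- "$field" "$@" 2>/dev/null | paste -sd ' ' -)
        echo "$field: ${hits:-not found in sources}"
    done
}

# Run only when the files from the tutorial directory are present.
if [ -f debug.01.000000 ] && [ -f model1.F90 ] && [ -f model2.F90 ]; then
    report_fields debug.01.000000 model1.F90 model2.F90
fi
```

If every reported name does show up in the sources, a naming mismatch becomes less likely and it is worth looking at how the executables are launched.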

Best regards, Laure

Posted by Anonymous at November 28 2016

Thanks Laure for your answer.

When I saw the error in the debug file, I first checked whether the configuration in model1.F90 and model2.F90 agreed with the definitions in the namcouple file, and it did. I also searched this forum, and the answer was the same: there should be a wrong definition in one of those files. So I decided to use model1.F90_TP and model2.F90_TP, as well as the namcouple_TP in the data_oasis3 directory, but I got the same error again. Do you have any idea what could be wrong if I get this error even with the *_TP files?

Thanks in advance Pedro

Posted by Anonymous at November 29 2016

Hi Pedro.

I reproduced your problem using model1.F90_TP and model2.F90_TP in tutorial.

In fact, you also have to modify run_tutorial so that it launches the coupled model (at the end of the script).

For example, on my computer I had to change the commands that run the uncoupled models:

$MPIRUN -np $nproc_exe1 ./$exe1 > runjob.err

$MPIRUN -np $nproc_exe2 ./$exe2 >> runjob.err

into:

$MPIRUN -np $nproc_exe1 ./$exe1 : -np $nproc_exe2 ./$exe2 > runjob.err
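Before editing run_tutorial, it can help to print the launch line instead of executing it, to confirm that the script's variables expand to a single launch with the two executables separated by a colon. The sketch below is a dry run only. The variable names are the ones used in the commands above, while the helper function and the fallback values are placeholders for illustration.

```shell
#!/bin/sh
# Sketch only: mpmd_command is a hypothetical helper, not part of run_tutorial.

# Build the single MPMD command line used for the coupled run.
# Arguments: launcher, ranks for exe1, exe1, ranks for exe2, exe2.
mpmd_command() {
    printf '%s -np %s ./%s : -np %s ./%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Placeholder fallbacks, used only if the variables are not already set.
MPIRUN=${MPIRUN:-mpirun}
nproc_exe1=${nproc_exe1:-1}
nproc_exe2=${nproc_exe2:-1}
exe1=${exe1:-model1}
exe2=${exe2:-model2}

# Dry run: show the command without starting any MPI processes.
echo "would run: $(mpmd_command "$MPIRUN" "$nproc_exe1" "$exe1" "$nproc_exe2" "$exe2") > runjob.err"
```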

Best regards, Laure

Posted by Anonymous at December 5 2016

Hi Laure

I have been working on OASIS, but I still cannot run the tutorial example. I have always run the example with $MPIRUN -np $nproc_exe1 ./$exe1 : -np $nproc_exe2 ./$exe2 > runjob.err, as you suggested, so that was not the problem.

After several attempts, I decided to comment out parts of the code in model1.F90_TP and model2.F90_TP, specifically the OASIS calls. I found that the problem arises whenever the call to "oasis_enddef" is active. If I comment out this line and the calls to oasis_get/put, the example runs and finishes, but with the call to oasis_enddef left in, it crashes. On my computer I am using the "ubuntu" configuration, in case that helps.

Thanks again and regards Pedro

Posted by Anonymous at December 6 2016

Hi Pedro,

I am sorry, but I do not understand what could be wrong. The tutorial toy is a very simple model. Which version of OASIS3-MCT are you using?

I do not know whether this will change anything, but could you download the tutorial again and copy data_oasis3/namcouple_TP to data_oasis3/namcouple? Then copy model1.F90_TP to model1.F90 and model2.F90_TP to model2.F90, recompile everything, rerun everything, and let me know.

Best regards, Laure

Posted by Anonymous at January 26 2017

Hi Pedro and Laure,

I wanted to weigh in and say I have the same problem. After following Laure's latest post, I still get the same results. Have either of you made any headway? I'm running the latest version downloaded via svn (r1874) on a Linux box (CentOS Linux release 7.2.1511).

Cheers, Einar

Posted by Anonymous at January 27 2017

Hi Einar,

In fact, the master of OASIS (r1874) is still under development, and I am not sure that the tutorial works with version r1874 of OASIS.

You must download the official version of OASIS, located on the corresponding stable and tested branch OASIS3-MCT_3.1, using the command:
git clone https://gitlab.com/cerfacs/oasis3-mct.git

We do our development on the trunk using different branches, and when a new official release is ready for users (after testing and debugging) we put it on a stable branch where we only fix important bugs.

Let me know if it works better with 3.1.

Best regards, Laure

PS: Which compiler do you use on your platform?

Posted by Anonymous at January 30 2017

Hi Laure,

I’ve figured this out now. I had some bad references in my PATH variable, so the run_tutorial script called the wrong version of mpirun. Strangely enough, the program ran but produced the same error that Pedro reported. So I guess this error can, in general, be traced back to a misbehaving mpirun or mismatched Open MPI/MPICH installations. Maybe this knowledge can help Pedro.
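For anyone who wants to check for the same situation, here is a minimal sketch. It compares the directory of the mpirun found on PATH with the directory of the MPI compiler wrapper. The wrapper name mpif90 is an assumption, as are the helper function names, and "same directory" is only a heuristic for "same MPI installation".

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; mpif90 is an assumed wrapper name.

# Print the directory holding the first match of command $1 on PATH.
# Returns non-zero and prints nothing if the command is not found.
tool_dir() {
    tool_path=$(command -v "$1" 2>/dev/null) || return 1
    dirname "$tool_path"
}

# Report whether two commands are found in the same directory.
# Exit status: 0 same directory, 1 different directories, 2 command missing.
check_same_install() {
    dir_a=$(tool_dir "$1") || { echo "missing: $1"; return 2; }
    dir_b=$(tool_dir "$2") || { echo "missing: $2"; return 2; }
    if [ "$dir_a" = "$dir_b" ]; then
        echo "ok: $1 and $2 both come from $dir_a"
    else
        echo "mismatch: $1 is in $dir_a but $2 is in $dir_b"
        return 1
    fi
}

# Compare the launcher with the wrapper presumably used to build OASIS.
check_same_install mpirun mpif90 || true
```

A "mismatch" line is a hint that the launcher and the libraries the models were compiled against may come from different MPI installations, which is worth fixing in PATH before looking further.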

Also, I was using the git command you posted.
And I’m using the GNU compilers (gfortran) at the moment for development. Our production system uses the Portland compilers (although GNU is also available), but I haven’t tried those yet.

Cheers, Einar

Posted by Anonymous at February 8 2017

Hi Laure and Einar,

After many attempts with the OASIS tutorial using different configurations and versions, I got it to work. The problem was exactly the one Einar reported in the last post: bad references in the PATH related to MPI. Once I fixed the bad references, the tutorial worked properly.

Thanks to both of you for your help. Regards, Pedro