The proposal of this paper is to use idle CPU cycles to compress network messages in advance and send them compressed only if the compression has finished by the time delivery is required. This approach ensures that compressed messages are used only when the network link is becoming the bottleneck, and that in all other cases the technique does not affect performance. The main guidelines for this method are the following:
- Overlapping transmission and compression. The idea is to run compression threads in parallel with the MPI process; these threads compress messages while the sender is waiting for the receiver to be ready or is sending other data.
- Data is sent as soon as possible, whether it is compressed or not. Messages are split into slices to manage compression. When the time comes to send, the slices that are already compressed are sent first. Next, the slices still waiting to be compressed are sent uncompressed. Finally, the slices being compressed by the threads at that moment are stopped and sent uncompressed as well.
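The slice-handling rule in the two guidelines above can be illustrated with a minimal Python sketch. It is not the Open MPI implementation described here: it uses `zlib` from the standard library in place of LZO, a thread pool in place of the dedicated compression threads, and hypothetical names (`SLICE_SIZE`, `split_message`, `compress_ahead`, `plan_send`, `reassemble`). Python threads cannot be interrupted, so a slice whose compression is still running is simply sent uncompressed and its result ignored.

```python
import zlib
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List, Tuple

SLICE_SIZE = 64 * 1024  # hypothetical slice size in bytes

# One slice on the wire: (slice index, is_compressed, payload)
Slice = Tuple[int, bool, bytes]


def split_message(message: bytes, size: int = SLICE_SIZE) -> List[bytes]:
    """Split a message into slices that can be compressed independently."""
    return [message[i:i + size] for i in range(0, len(message), size)]


def compress_ahead(slices: List[bytes], pool: ThreadPoolExecutor) -> List[Future]:
    """Start compressing every slice in the background while the sender waits."""
    return [pool.submit(zlib.compress, s) for s in slices]


def plan_send(slices: List[bytes], futures: List[Future]) -> List[Slice]:
    """Decide what goes on the wire at the moment delivery is required."""
    ready, waiting, in_progress = [], [], []
    for idx, (raw, fut) in enumerate(zip(slices, futures)):
        if fut.done() and not fut.cancelled() and fut.exception() is None:
            # Compression finished in time: send the compressed slice.
            ready.append((idx, True, fut.result()))
        elif fut.cancel():
            # cancel() succeeds only if compression never started.
            waiting.append((idx, False, raw))
        else:
            # Compression is running (or failed): send raw, ignore the result.
            in_progress.append((idx, False, raw))
    # Compressed slices first, then waiting slices, then interrupted ones.
    return ready + waiting + in_progress


def reassemble(wire: List[Slice]) -> bytes:
    """Receiver side: restore slice order and decompress where flagged."""
    ordered = sorted(wire, key=lambda s: s[0])
    return b"".join(zlib.decompress(p) if c else p for _, c, p in ordered)
```

Because every slice carries its index and a compression flag, the receiver can rebuild the message regardless of the order in which compressed and uncompressed slices arrive.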
The achievements mainly comprise an implementation within the OpenMPI framework, as follows:
- A new base-system (OPAL) framework for compression is included, with an LZO implementation module.
- A new version of the OB1 module for the point-to-point (PML) framework is included to implement the compressed rendezvous protocol.
Ceria (CeO2) is well known as a key component of automotive catalysts and as an efficient catalyst or support for a range of other catalytic processes of great industrial and environmental interest. Some of the most important applications of these systems relate to their reactivity towards CO or CO2, in connection with reducing pollutants and greenhouse gases. In addition, experiments show ceria to be a suitable support for noble metal catalysts in several reactions aimed at the production or deep purification of hydrogen as an environmentally friendly fuel for fuel cells. Some of these reactions, such as the water-gas shift reaction and the preferential oxidation of CO in a hydrogen feed, include as a key step the interaction of ceria with carbon monoxide (CO) and carbon dioxide (CO2). Two features of ceria are very important for the activity and performance of the support/catalyst: the size of the ceria particles (nanoparticles are found to be several orders of magnitude more active than large particles) and the degree of oxidation/reduction of the sample. For this reason, our computational investigation was based on a ceria particle (Ce21O42) designed in the host group, instead of the extended ceria surface used in other computational studies. During the research stay we performed extensive calculations along the following directions:
1. Determination of the preferred positions for the formation of oxygen vacancies on the ceria nanocluster and estimation of the vacancy formation energy;
2. Interaction of CO and CO2 with the ceria nanoparticle;
3. Interaction of a platinum cluster (Pt8) with the ceria nanoparticle.
The search for different positions of oxygen vacancies started from the stoichiometric ceria nanocluster Ce21O42 by removing selected oxygen atoms. Removing a neutral oxygen atom converts two Ce(IV) ions of the initial stoichiometric particle into two Ce(III) ions, and one of the unclear issues is the preferred location of those Ce(III) centers. According to the calculated relative energies of the structures, the stability of the O vacancies decreases in the following order: sub-surface layer, surface, edge. The Ce(III) ions are preferentially located next to the O vacancy when it is on the surface or at an edge, and farther from it when it is sub-surface.
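For reference, a vacancy formation energy of this kind is commonly defined relative to the stoichiometric particle and a gas-phase oxygen reference. The summary does not state which reference was used, so the half O2 molecule below is an assumption; the symbols E(...) denote total energies of the respective systems.

```latex
E_{\mathrm{vac}} = E(\mathrm{Ce_{21}O_{41}}) + \tfrac{1}{2}\,E(\mathrm{O_2}) - E(\mathrm{Ce_{21}O_{42}}),
\qquad
2\,\mathrm{Ce^{4+}} + \mathrm{O^{2-}} \;\longrightarrow\; 2\,\mathrm{Ce^{3+}} + \tfrac{1}{2}\,\mathrm{O_2}
```

The second relation is the electron bookkeeping behind the two Ce(III) centers: the two electrons left behind by the departing neutral oxygen atom reduce two Ce(IV) ions.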
The interaction of the ceria nanoparticle with CO resulted in spontaneous formation of CO2 or of surface carbonate when the local structure of the particle is suitable. The energy for CO2 formation from CO (accompanied by formation of an oxygen vacancy on the cluster) is found to be between 0.8 and 1.5 eV. Interaction of CO2 with the catalyst's surface also results in the formation of carbonate species whose stability depends on the adsorption mode of the carbonate and its location (edge, corner or facet). The simulated vibrational frequencies of the carbonates are similar to the experimentally measured frequencies of some of the surface species.
To find a stable location for the platinum cluster Pt8 on the ceria nanoparticle, we checked 8 different structures, varying the shape and location of the metal cluster. The structures, initially optimized with 4 unpaired electrons, were re-optimized with 2 and with 6 unpaired electrons. In one case the state with 4 unpaired electrons (one of them on the ceria part, forming a Ce(III) center) was found to be the most stable, while for the other structures the state with lower spin multiplicity (triplet) was lower in energy. In the most stable structure, the binding energy of platinum was calculated to be 5.6 eV, with five platinum atoms interacting with the ceria surface.
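The two quantities used in this paragraph can be written out explicitly. The sign convention for the binding energy (positive when the cluster is bound, consistent with the reported 5.6 eV) and the choice of the free Pt8 cluster and the bare Ce21O42 particle as references are assumptions, since the summary does not spell them out.

```latex
E_{\mathrm{b}} = E(\mathrm{Pt_8}) + E(\mathrm{Ce_{21}O_{42}}) - E(\mathrm{Pt_8/Ce_{21}O_{42}}),
\qquad
2S + 1 = N_{\mathrm{unpaired}} + 1
```

With this relation, 2 unpaired electrons correspond to a triplet, 4 to a quintet and 6 to a septet.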
The calculations were performed with a periodic plane-wave DFT method using the PW91 exchange-correlation functional as implemented in the VASP program. The kinetic energy cut-off was set to 415 eV, and a cube with sides of 2.00 nm was used as the unit cell. This size provides a distance of ca. 1.00 nm between nanoparticles in neighboring unit cells. Because of the inherent deficiency of pure DFT functionals in describing localized electrons, we applied the DFT+U approach to ensure proper localization of the extra electron on individual Ce(III) cations.
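As an illustration of why a +U correction favors localization, one widely used rotationally invariant form (due to Dudarev and co-workers) is shown below. The summary names neither the DFT+U formulation nor the value of U, so both the choice of this form and the symbols (the effective parameter U_eff and the on-site occupation matrix of the Ce 4f states for spin σ) are assumptions.

```latex
E_{\mathrm{DFT}+U} = E_{\mathrm{DFT}} + \frac{U_{\mathrm{eff}}}{2} \sum_{\sigma} \mathrm{Tr}\!\left[ \rho^{\sigma} - \rho^{\sigma}\rho^{\sigma} \right],
\qquad U_{\mathrm{eff}} = U - J
```

The added term vanishes when every eigenvalue of the occupation matrix is 0 or 1 and is positive for fractional occupations, so an electron spread over several Ce centers is penalized relative to one localized on a single Ce(III) cation.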
The investigation is supported by the HPC-Europa2 program at the Barcelona Supercomputing Center.
The HPC-Europa2 project has allowed the author to optimise the parallel performance of an unsteady, unstructured, high-order (≥ 3) h/p Discontinuous Galerkin-Fourier solver for the incompressible Navier-Stokes equations on static and rotating meshes in two and three dimensions. This general-purpose solver is used to provide insight into the physical phenomena of cross-flow (wind or tidal) turbines.
Simulation of this type of turbine for renewable energy generation needs to account for the rotational motion of the blades with respect to the fixed environment. This rotational motion implies azimuthal changes in blade aero/hydro-dynamics that result in complex flow phenomena such as stalled flows, vortex shedding and blade-vortex interactions.
Simulation of these flow features necessitates the use of a high order code exhibiting low numerical errors. The author's doctoral work presents the development of such a high order solver, which has been conceived and implemented from scratch by the author.
To account for the relative mesh motion, the incompressible Navier-Stokes equations are written in arbitrary Lagrangian-Eulerian form and a non-conformal Discontinuous Galerkin (DG) formulation (i.e. Symmetric Interior Penalty Galerkin) is used for spatial discretisation. The DG method, together with a novel sliding mesh technique, allows direct linking of rotating and static meshes through the numerical fluxes. This technique shows spectral accuracy and no degradation of temporal convergence rates if rotational motion is applied to a region of the mesh. In addition, analytical mappings are introduced to account for curved external boundaries representing circular shapes and NACA foils.
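For orientation, the arbitrary Lagrangian-Eulerian form of the incompressible Navier-Stokes equations is usually written as below. The notation is an assumption rather than the author's: u is the fluid velocity, w the mesh velocity, p the kinematic pressure, ν the kinematic viscosity, and the time derivative is taken at a fixed mesh (reference) coordinate χ.

```latex
\left.\frac{\partial \mathbf{u}}{\partial t}\right|_{\boldsymbol{\chi}}
+ \big((\mathbf{u} - \mathbf{w}) \cdot \nabla\big)\mathbf{u}
= -\nabla p + \nu \nabla^{2}\mathbf{u},
\qquad \nabla \cdot \mathbf{u} = 0
```

For a rigidly rotating mesh region the mesh velocity is w = Ω × r, while w = 0 in the static region, where the equations reduce to the usual Eulerian form.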
To simulate 3D flows, the 2D DG solver is parallelised and extended using Fourier series. This extension allows for laminar and turbulent regimes to be simulated through Direct Numerical Simulation and Large Eddy Simulation (LES) type approaches.
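A Fourier extension of this kind typically assumes one homogeneous, periodic direction and expands the solution in that direction. The form below is a generic sketch; the spanwise coordinate z, the period L_z and the number of modes N are assumed symbols, not taken from the summary.

```latex
\mathbf{u}(x, y, z, t) = \sum_{k=-N/2}^{N/2-1} \hat{\mathbf{u}}_k(x, y, t)\, e^{\mathrm{i} k \beta z},
\qquad \beta = \frac{2\pi}{L_z}
```

Derivatives in z become multiplications by ikβ, so the linear operators separate into one 2D problem per Fourier mode, with the modes coupled only through the nonlinear term.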
Various 2D and 3D cases have allowed the code performance to be evaluated using the Paraver program developed at BSC. During the author's visit to BSC, the parallel efficiency was improved through the complementary use of both the MPI and OpenMP paradigms throughout the solver. Overall, a 60% decrease in computational cost has been achieved.
- Use of the Paraver program developed at BSC to explore the parallel efficiency of the author's solver.
- Identification of bottlenecks in the solver performance.
- Improvement of MPI performance for simulations where dynamic mesh rotation is involved and the high order sliding mesh technique is used.
- Further improvements in performance have been achieved through OpenMP parallelisation of selected functions and loops.
- Overall, a 60% decrease in computational cost has been achieved.
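To put the reported figure in perspective, the short sketch below converts a fractional cost reduction into a speedup factor and gives the usual definition of parallel efficiency. It assumes that "computational cost" means run time for the same problem on the same resources, which the summary does not state; the function names are hypothetical.

```python
def speedup_from_cost_reduction(reduction: float) -> float:
    """Speedup factor implied by a fractional cost reduction (0 <= reduction < 1)."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    # New cost is (1 - reduction) times the old cost.
    return 1.0 / (1.0 - reduction)


def parallel_efficiency(t_serial: float, t_parallel: float, workers: int) -> float:
    """Classical parallel efficiency: serial time over (workers x parallel time)."""
    return t_serial / (workers * t_parallel)
```

Under that assumption, `speedup_from_cost_reduction(0.60)` gives a factor of about 2.5: the optimised solver would complete the same simulation in 40% of the original time.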