Note: This post was written about Azure Data Factory V1, but is also applicable to V2.
On a recent project I used Azure Data Factory (ADF) to retrieve data from an on premises SQL Server 2014 instance and land them in Azure Data Lake Store (ADLS) as ORC files. This required the use of the Data Management Gateway (DMG). Setup was quick and easy in our development environment. We installed the DMG for development on a separate server in the client’s network, where we also installed SQL Server Management Studio (SSMS) for query development and data validation. We set up resource groups in Azure for development and production, and made sure the settings for development and production were the same. Then we set up a separate server for the production DMG.
Deployment and execution went well in the dev environment. Testing was completed, so we deployed to prod. Deployment went fine, but the pipelines failed execution and returned the following error on the output data sets.
The Java Runtime Environment (JRE) is not required for the DMG to run successfully, but it is needed for the creation of ORC files. The only problem with this error message was that we did indeed have the JRE installed on the server.
After reinstalling the JRE and the DMG and getting the same error, I consulted the troubleshooting guide. After finding nothing relevant there, I asked some colleagues for suggestions.
- I double-checked that I had the same version of the DMG that worked in dev and that I had the most current version of the JRE.
- I double-checked that the DMG and JRE matched bit-wise (32-bit vs. 64-bit). Both were 64-bit in my case.
- I checked that JAVA_HOME was set correctly in the environment variables.
When none of those things worked, I logged a support ticket with Microsoft. They had me do the following:
- Check the registry key – HKEY_LOCAL_MACHINE\Software\JavaSoft\Java Runtime Environment should have a Current Version entry that shows the current JRE version.
- Check that the subkey in the folder labeled with the version has a JAVAHOME entry with the correct path (something like C:\Program Files\Java\jre1.8.0_74).
- Open the path and check that the bin folder exists.
- Check that jvm.dll exists in the bin/server folder.
When none of those things worked, they gave me one last suggestion:
Install the Microsoft Visual C++ 2010 Redistributable Package.
And that turned out to solve the problem!
After review, we realized that we had installed SSMS on the dev DMG server but not on the prod DMG server. SSMS would have required the installation of the C++ redistributable package, which is why we didn’t encounter this error in dev.
I will confess that I don’t understand exactly why missing C++ libraries manifest themselves in an error claiming a missing Java Runtime Environment. If you have a good explanation, please leave it in the comments and I’ll update this and give you credit.
I hope that someone else who runs into this issue will find this blog post and avoid days of troubleshooting and confusion.