Data Platform/Systems/Refine/Deploy Refinery-source
Refinery-source is the JVM software that runs on the Analytics Cluster, used from Spark or as Hive UDFs. The source code is in the analytics/refinery/source repository.
How to deploy with Jenkins (and related steps)
Before starting, please check:
- Check the latest refinery-source deployed version. This can be done in various ways on the git repository, for instance:
- using git tags:
git tag --list
- using the list of commits in the refinery's artifacts directory
- Look at the Refinery source commit list and make sure that changelog.md has been updated with the latest version, and possibly that [maven-release-plugin] has committed the related version bump changes (the last step is optional; it can be triggered manually following the instructions below).
- Check the current SNAPSHOT version committed in the repo's pom.xml files:
git grep SNAPSHOT
The analytics-refinery-maven-release-docker builder (see below) will use the SNAPSHOT version unless told otherwise, not the version present in changelog.md (changelog.md is for human tracking purposes).
- For example, suppose that the current SNAPSHOT version is 0.0.147-SNAPSHOT.
- If you simply want to jump to 0.0.148, then you are good.
- Otherwise, suppose that you want to jump to 0.1.0.
- You need to use something like the following to modify the pom.xml files:
find . -name pom.xml -exec sed -i -e 's/0\.0\.147-SNAPSHOT/0.1.0-SNAPSHOT/' {} \;
- Then you'll need to mention this in changelog.md, along with the changes related to the code.
- Make sure you are logged in to Jenkins to see the pages linked in the steps below.
Have you read the instructions reported above? If so, please keep going, otherwise please read them!
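The SNAPSHOT checks above can be sketched as two small shell helpers. This is only an illustration, not part of the refinery tooling: the function names list_snapshot_versions and set_snapshot_version are hypothetical, and GNU sed is assumed for the -i flag.

```shell
# Hypothetical helpers; names are illustrative and not part of refinery-source.

# Print the distinct X.Y.Z-SNAPSHOT versions found in pom.xml files under a directory.
list_snapshot_versions() (
  dir=$1
  find "$dir" -name pom.xml \
    -exec grep -hoE '[0-9]+(\.[0-9]+)*-SNAPSHOT' {} + | sort -u
)

# Rewrite <old>-SNAPSHOT to <new>-SNAPSHOT in every pom.xml under a directory.
# Assumes GNU sed (in-place editing with -i).
set_snapshot_version() (
  dir=$1; old=$2; new=$3
  # Escape the dots so that "0.0.147" only matches literal dots.
  old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
  find "$dir" -name pom.xml \
    -exec sed -i -e "s/${old_re}-SNAPSHOT/${new}-SNAPSHOT/g" {} +
)
```

For instance, from the root of a refinery-source checkout, list_snapshot_versions . should print a single version, and set_snapshot_version . 0.0.147 0.1.0 would perform the jump described above. Review the result with git diff before committing.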
Deploy procedure:
- Update the changelog.md file at the root of the repository with changes that are going to be deployed - commit and merge this change.
- Release a new version of refinery-source jars to Maven:
- Visit analytics-refinery-maven-release and click the "Build" button (default parameters shouldn't be changed except if you know what you're doing :).
- The job should take a few minutes to run. An email is sent to the analytics-alerts list saying whether the job succeeded or failed (the result is also shown on the page used to launch the job, if you don't close it).
- NOTE: it's best to wait a few minutes between the build and the next step.
- Now we need to update the symlinks to the latest refinery-source jars in the refinery repository.
- This can also be done via Jenkins, by going to analytics-refinery-update-jars and setting the RELEASE_VERSION field to the version number of the latest jars released, without the v prefix.
- Example: v0.0.40 is bad, 0.0.40 is good.
- Then hit Build.
- This will send a code review to Gerrit in the refinery repository and send a message to the #wikimedia-analytics IRC channel as well as an email to the analytics-alerts mailing list.
- Look at the code review generated by Jenkins; it should contain the new jars and updates for the default-jars symlinks.
- If it looks good (normally, yes :), you can merge it and continue the deployment process.
- Most likely, having deployed new or changed Java/Scala code, you will want to apply it in refinery and Airflow.
- deploy refinery with your changes
- Considering Airflow jobs:
- bump up the jar version in the artifact.yaml file in your Airflow dag repository
- use the correct artifact in your dag
- deploy your Airflow dag repo
Remember to post a log entry to the analytics IRC channel upon successful deploy, for instance: !log Deployed refinery-source using jenkins
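The RELEASE_VERSION rule from the update-jars step (no v prefix) is easy to get wrong when copying a tag name. A minimal sketch of a check you could run before pasting the value into Jenkins; normalize_release_version is a hypothetical helper, not something the Jenkins job provides:

```shell
# Hypothetical helper: print the version without a leading "v",
# and fail if the result is not a dotted numeric version such as 0.0.40.
normalize_release_version() (
  version=${1#v}
  case $version in
    '' | *[!0-9.]* | .* | *. | *..*)
      echo "invalid version: $1" >&2
      exit 1
      ;;
  esac
  printf '%s\n' "$version"
)
```

Both normalize_release_version v0.0.40 and normalize_release_version 0.0.40 print 0.0.40, while a value such as 0.0.40-SNAPSHOT is rejected with a non-zero exit status.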
If the maven release job failed (step 2)
First find out why by looking at the job logs, and then skip the version number you were deploying. Update changelog.md with the skipped version and build again!
If you really want to build a specific version, then some extra steps are needed:
- Remove the git tag of the versions that you don't want (local and remote of course)
- Update the pom.xml files with the version that you want to build (as mentioned above, the -SNAPSHOT version).
- Add this information to changelog.md (if needed) and merge the change.
- Drop the release version artifacts that you don't want from Archiva.
- Kick off the build again!
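Removing an unwanted tag "local and remote" comes down to two git commands. The sketch below wraps them in a hypothetical remove_release_tag function. The tag name format (for example v0.0.148) and the remote name origin are assumptions; check git tag --list and git remote -v for the real values.

```shell
# Hypothetical helper: delete a release tag locally, then on the remote.
# Set RUN=echo to print the commands instead of executing them.
remove_release_tag() (
  tag=$1
  remote=${2:-origin}   # assumed remote name
  ${RUN:-} git tag -d "$tag" &&
    ${RUN:-} git push --delete "$remote" "refs/tags/$tag"
)
```

Running RUN=echo remove_release_tag v0.0.148 first shows exactly what would be executed, which is worth doing since deleting a remote tag affects everyone using the repository.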
Administration
- Archiva-CI credentials are stored in Jenkins at https://integration.wikimedia.org/ci/credentials/.
- Jenkins Maven / Archiva settings are at https://integration.wikimedia.org/ci/configfiles/.
- You can edit the analytics-refinery-release Jenkins job at https://integration.wikimedia.org/ci/job/analytics-refinery-release/configure
- Maven Jenkins job config is in the integration/config repository
Changing the archiva-ci password
Archiva expires user passwords periodically (every 90 days?). This means we need to update the archiva-ci user's password in Archiva, and also in the Jenkins stored credential.
Log into archiva.wikimedia.org as the admin user (the password is in the ops pwstore). Go to 'Manage', click on the edit (pencil) button for the archiva-ci and change the password.
Then log into Jenkins and go to the archiva-ci credentials page. Click on the 'Update' link, then click on Change Password, and set the password to the same one you set in Archiva.
How to deploy from the CLI
- Update the changelog.md file at the root of the repository with the changes that are going to be deployed.
- Prepare deployment (change pom.xml files, push to git):
mvn -Duser.name=YOUR_WIKITECH_USERNAME release:prepare
- Check that everything looks OK:
cat release.properties
- Actually deploy (jar generation and uploads to archiva)
mvn -Duser.name=YOUR_WIKITECH_USERNAME release:perform
- Download the new version of the jars from Archiva (Warning: git fat uses IDs to manage files, so it is important to use the correct jars). For convenience, the commonly updated jars are:
- NOTE: to download, click on the new version you just created, then Artifacts, then copy the link to the jar, replace 127.0.0.1:8080 with archiva.wikimedia.org, and download that.
- Copy the updated jars to the correct refinery path (somewhere like <refinery>/artifacts/org/wikimedia/analytics/refinery/)
- Update the symlinks in refinery/artifacts to the jars you just copied
- Make sure git fat is installed and configured according to the instructions in the Refinery README
- Commit the changes (git fat will do some magic and replace the jar with a one-line id):
git add . && git commit
- Push for review.
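The download and symlink steps above can be sketched as follows. Both function names are hypothetical, and the symlink name and relative jar path passed to update_jar_symlink depend on the actual layout of refinery/artifacts, so treat the arguments as placeholders to adapt.

```shell
# Hypothetical helpers for the manual jar update; adapt names and paths to your checkout.

# Replace the local host that appears in copied Archiva artifact links with the public host.
archiva_public_url() {
  printf '%s\n' "$1" | sed 's#127\.0\.0\.1:8080#archiva.wikimedia.org#'
}

# Point <artifacts_dir>/<link_name> at <target>, a jar path relative to <artifacts_dir>.
# Refuses to create a dangling link if the jar has not been copied yet.
update_jar_symlink() (
  artifacts_dir=$1; link_name=$2; target=$3
  if [ ! -e "$artifacts_dir/$target" ]; then
    echo "missing jar: $artifacts_dir/$target" >&2
    exit 1
  fi
  ln -sfn "$target" "$artifacts_dir/$link_name"
)
```

A copied link can be passed through archiva_public_url and the result handed to a download tool such as curl or wget. After copying the jar into place, update_jar_symlink <refinery>/artifacts <link name> <relative jar path> repoints the link, and git status should then show the new jar and the changed symlink.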
Please see the refinery page to deploy the jars and oozie code.