Reject invalid ParsedNode instances when loading them into Liquibase objects

Description

Liquibase is currently ignores ParseNodes it doesn't like, and it does so silently. I think we should enhance ParseNode loading to let developers know when changes (or other elements) are invalid.

Consider the following scenario: A user wants to add a column to a table, and it needs to have a default value. The developer uses the following YAML changelog:

You've probably spotted the error with this change already - I forgot to capitalize the "V" in "defaultValue". This is not an issue in XML changelogs, where XSD validation catches the error before we try to run anything, but other parsers like the YAML parser, and my beloved Groovy parser don't have the benefit of schema validation.

If I apply this changelog, Liquibase will run to completion and report success. I'll get a new column in my table, but it won't have a default value, and there will be no indication that anything was amiss. I think this is a problem because we reported success, but we did not do what the developer was intending or expecting. I don't think we should report success when we don't give the developer exactly what was requested. In this particular case, it could be a while before anyone notices the missing default, and by this time a lot of data could be in the database without it.

I think it is much better to fail the run as soon as we discover something in the ParsedNode data that we don't understand and let the developer know. In this case, we'd report that we don't know what to do with "defaultvalue" parameter of a "column" definition in changeSet 1. This lets the developer know there is a mistake, and gives him a chance to fix it before any actual work is done.

The Goovy parser used to deal with the lack of validation problem by using ObjectUtil to set properties. If it failed, I'd stop there and let the user know that the property was invalid and where I found it, but as we discovered in Issue 1931, this is no longer an option for me with Liquibase 3.2

Looking at the code, I think the underlying issue is that we load objects by looping through the object's valid parameters (via ChangeParameterMetaData), looking for ParsedNodes that contain that parameter's data. It ignores ParsedNodes that don't line up with a valid parameter. I think that a better approach would be to loop through the ParsedNodes and look for a matching parameter, failing immediately if one can't be found. We could even add a boolean flag to ChangeParameterMetaData to indicate when a parameter has already been processed if we wanted to catch repeats of a parameter in a change.

Similar changes would probably need to be made to other classes with "load" methods as well.

I really like the change to ParsedNode instances, but I feel like it would be a mistake to update the GroovyParser until I can keep the validation that it currently provides.

What are your thoughts?

Environment

All

Activity

Show:

Nathan Voxland
June 23, 2015 at 8:25 PM

Yes, I agree. They should all know, but I don't think the code as it is will be able to support that well. I'm working on a larger 4.0 refactoring to clean up a lot of things like this that isn't easy without a larger code change.

Steve Saliman
June 23, 2015 at 1:28 AM

I agree that a ParsedNode should not know or care about what constitutes a valid node, but each object that a ParsedNode represents must have a "load" method that does the job of converting ParsedNode objects into valid changes, column configs, etc. Each object can certainly know what ParsedNodes are valid for the object trying to load the ParsedNode.

My feeling is that not only can each object know what ParsedNodes are valid, but that each object should know what ParsedNodes are valid. I think that every attribute developers put into a changelog represents intent, and that we need to let them know if we cannot honor that intent.

As for extensions that add new attributed, wouldn't they either need to implement their own load method, or at least do something that would allow an object to know, at run time, what attributes are valid? I don't think it would be very hard to loop through each ParsedNode to see if any of them contain an attribute that isn't in the collection of valid attributes instead of just looping through attributes and ignoring anything it doesn't recognize.

Nathan Voxland
June 22, 2015 at 5:24 PM

I tried to add a check to ChangeSet but ran into problems because at the ParsedNode point, we don't know if a node was an attribute or a nested tag and so i can't even whitelist things like "id", "author" etc. and fail on other nodes because you can have changes and other objects inside changeSet which are valid but wouldn't be on the list.

We also need to balance helpful user errors with the ability for extensions to add new attributes that we don't know about at compile time.

Nathan Voxland
June 22, 2015 at 5:12 PM

I fixed the issue for change attributes and also ColumnConfig objects, which is the majority of what people will be using. You are right that the load() method needs to be modified for all object types. I'm going to keep the issue open but move it to 4.0 when I can make a more general-case fix.

Nathan Voxland
March 6, 2015 at 6:58 PM

Sorry, finally catching up on jira issues. I'm hoping to have a 3.4 release out in the next few weeks and will look at this as part of that.

Resize issue view side panel

Details

Reporter

Steve Saliman

Fix versions

4.0.0

Affects versions

3.2.0

Priority

Major

Created July 17, 2014 at 3:38 AM

Updated June 23, 2015 at 8:25 PM