Hrorm Forums

New Release 0.10.0


#1

Just pushed out version 0.10.0 of hrorm. Release notes:

  • Generic Column support added, user-defined column types
  • Better schema support
    • Unique constraints
    • Interface to override SQL types
  • Internals improved:
    • Simplified column collation
    • DAO builders independent but tested for matching methods

The generic column stuff is the biggest deal. Hrorm is now open to expansion by users without getting under the hood. I really should have done that a long time ago.

With that and the improved schema support, you can really customize both your database schema and your Java object model very easily.

But the improvements to the internals of hrorm actually took the longest. Some of the code was quite fragile and the way the different DAO builder classes interacted was pretty janky. That stuff is much cleaner now.

Looking at hrorm now, it’s time to start splitting up the code into packages. The org.hrorm package now has 56 classes. Looking at the docs is annoying. I am thinking about just adding an org.hrorm.internals package, to guide users towards the parts that matter and away from the parts they don’t need to care about. But that might not be the best. I am always a bit suspicious of “stay out of here” signs. Maybe something better will come to mind.


#2

I’ve had a chance to look at the code and I think the GenericColumn interface can be simplified, or probably better, wrapped in a way to make it easier to construct. I’ll give this a shot.


#3

Here’s the direction I’m going in:

(Not all types done this way yet)

Having every specificTypeColumn() method defer to genericColumn() is going to make you gravitate towards cleaner, more composite interfaces.

public static <ENTITY, BUILDER> AbstractColumn<String, ENTITY, BUILDER> stringColumn(
        String name, String prefix, Function<ENTITY, String> getter, BiConsumer<BUILDER, String> setter, boolean nullable) {
    GenericColumn<String> column = new GenericColumn<String>(JDBCInteraction.STRING, Types.VARCHAR, "text");
    return genericColumn(name, prefix, getter, setter, column, nullable);
}

Even that can be simplified. Additionally downstream in AbstractColumn, you can get rid of the duplicate sqlTypeName field, opting to pass on ColumnType and whatnot.

If eventually you further defer entity hydration concerns to later in the pipeline, AbstractColumn itself starts looking redundant.

The downside to this approach is also fairly evident: a lot of enumeration. Yikes at java.sql.Types not being an enum! They really didn’t make it easy to enforce something there, making you instead pass around ints.
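For what it's worth, since Java 8 the JDK does ship an enum that mirrors the java.sql.Types constants: java.sql.JDBCType. A minimal sketch of turning one of those raw ints into something readable; the SqlTypeNames class and its describe method are made up for illustration and are not part of hrorm:

```java
import java.sql.JDBCType;
import java.sql.Types;

// Hypothetical helper, not part of hrorm.
class SqlTypeNames {

    // Converts a raw java.sql.Types int into a readable label.
    // JDBCType.valueOf(int) throws IllegalArgumentException for codes
    // that are not defined in java.sql.Types (e.g. vendor-specific ones).
    static String describe(int sqlType) {
        JDBCType type = JDBCType.valueOf(sqlType);
        return type.getName() + " (" + type.getVendorTypeNumber() + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(Types.VARCHAR));
        System.out.println(describe(Types.CLOB));
    }
}
```

The JDBC methods that need a type code for a null, such as PreparedStatement.setNull, still take the int, so this mostly helps with messages and with keeping the set of types in one place.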


#4

I was thinking along the same lines as you for the next release. It got so cheap to support all the built in types, you might as well just do it.

I think you need to have the silly java.sql.Types value as part of the JDBCInteraction though. That’s needed for properly setting null values. But perhaps that’s handled some other way? I have only glanced at the code.

I could not help but chuckle at your comment about java.sql.Types. The early days of Java are filled with painful code like that. Everyone was still a C programmer. :smiley:


#5

The types value- that’s difficult. I think it probably should be a part of JDBCInteraction, but I’d do:

JDBCInteraction.string(int type)
where:
JDBCInteraction.STRING = JDBCInteraction.string(Types.VARCHAR);

So that this can be easily overridden. This emphasizes the downsides to my approach- lots of this boilerplate code that should be written for us in the JDK.
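A rough sketch of that shape, assuming a much-reduced stand-in for JDBCInteraction that only handles strings. The StringInteraction class and its bind method are hypothetical names, not hrorm's real code; the point is only to show the overridable factory and why the type int has to travel with it (setNull needs it):

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical, simplified stand-in for the JDBCInteraction discussed above.
final class StringInteraction {

    // Default instance, analogous to JDBCInteraction.STRING in the post.
    static final StringInteraction STRING = string(Types.VARCHAR);

    private final int sqlType;

    private StringInteraction(int sqlType) {
        this.sqlType = sqlType;
    }

    // Factory that lets a caller override the java.sql.Types value, e.g. Types.CLOB.
    static StringInteraction string(int sqlType) {
        return new StringInteraction(sqlType);
    }

    int sqlType() {
        return sqlType;
    }

    // Binding a null is where the type code is required by JDBC.
    void bind(PreparedStatement statement, int index, String value) throws SQLException {
        if (value == null) {
            statement.setNull(index, sqlType);
        } else {
            statement.setString(index, value);
        }
    }
}
```

With this layout, StringInteraction.string(Types.CLOB) gives a CLOB-flavored variant without touching the default.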

The direction this seems to be going, adding theDevelopersPreferredTypeColumn(…) (ZonedDateTime, UUID, etc) by extending your DaoBuilder / factory interfaces looks like it might be really easy with the right tweaks.

If you like what you’re seeing and want me to continue with this branch’s work to its completion, let me know and I’ll try to finish it off with a PR by the end of the week.

I haven’t given up on the HRORM showcase app by the way- I had to upgrade HRORM on the other projects I’m working on, and I just couldn’t get CDI injection working on the latest versions of RESTEasy for some reason in the hrormchat project. I’m switching that over to Jersey to see if I can mitigate that. I’ve spent hours debugging RESTEasy’s bean management code with nothing to show for it…


#6

I’ve fleshed this out further; again, a lot of boilerplate. A lot of tests fail when using the stringColumn() -> genericColumn override because of:

        @Override
        public Set<Integer> supportedTypes() {
            return Collections.singleton(genericColumn.sqlType());
        }

Exception:

org.hrorm.HrormException: Column name does not support type 2005
        at org.hrorm.Validator.validate(Validator.java:45)
        at org.hrorm.ComplexTest.testDaoValidation(ComplexTest.java:178)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Reason: I probably need to pass supportedTypes along with ColumnType, then my instinct would say that if sqlTypeName is specified, we need to use Collections.singleton like above and throw the exception shown. If no sqlTypeName is specified, we should default to something for Schema to be able to do its work, and pass all supported types out here.

Is it actually a failure scenario if our descriptor was created with VARCHAR but JDBC tells us the datastore is using CLOB?

shrug.

Validation may need to change… but I’m hesitant to just push forward with something without a little more thought/discussion on it. At the very least the error content should change to suggest a solution of some type, or more specifically identify the problem to the developer in a way they can solve it without spelunking into HRORM.
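To make the "better error content" idea concrete, here is a minimal sketch of a check that names the declared types, the type the database actually reported, and a possible fix. ColumnTypeCheck and problem are hypothetical names; this is not how hrorm's Validator is written, just one shape the message could take:

```java
import java.sql.JDBCType;
import java.sql.Types;
import java.util.Collections;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical sketch, not hrorm's actual Validator.
class ColumnTypeCheck {

    // Returns an explanatory message when the database type is not supported,
    // or an empty Optional when the column is fine.
    // Note: JDBCType.valueOf(int) rejects vendor-specific codes, so a real
    // implementation would need a fallback for those.
    static Optional<String> problem(String columnName, Set<Integer> supportedTypes, int actualType) {
        if (supportedTypes.contains(actualType)) {
            return Optional.empty();
        }
        String supported = new TreeSet<>(supportedTypes).stream()
                .map(code -> JDBCType.valueOf(code).getName())
                .collect(Collectors.joining(", "));
        return Optional.of("Column '" + columnName + "' is declared as one of [" + supported
                + "] but the database reports " + JDBCType.valueOf(actualType).getName()
                + " (java.sql.Types value " + actualType + "); "
                + "widen the column's supported types or change its declared SQL type.");
    }

    public static void main(String[] args) {
        Set<Integer> varcharOnly = Collections.singleton(Types.VARCHAR);
        problem("name", varcharOnly, Types.CLOB).ifPresent(System.out::println);
    }
}
```

If VARCHAR-declared columns backed by a CLOB turn out not to be an error, the same check passes once the supported set contains both Types.VARCHAR and Types.CLOB.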


#7

For the time being, I did this to get the tests over the line. May or may not be the right approach.

Again, the question remains whether this is an error condition or not. DataColumnFactory is not the right place to define the default type names, either (“text”, “integer”).


#8

I think the basic ideas here are right, but am still pondering some of the details.

  • Not sure why you got rid of the GenericColumnWrapper. That seemed right to me. I think that gets rid of most of the code in the DataColumnFactory. Maybe that can disappear altogether eventually.
  • I think the ColumnType class has merit. Clearly the string schema representation and the Java int value are related. The problem is that sometimes you want to change the schema/string value, but leave the int alone. So, I’m 50/50 on this.
  • I am not sure what separating the GenericColumn/JDBCInteraction into two classes buys. The GenericColumn has only one member variable, and the JDBCInteraction only exists in one place. I think this can be collapsed back into one class.
  • I think providing all the types to the user out-of-the-box is right, and we should keep that.
  • Not sure what the validation issue is with Clob/string/whatever. If it works then the validator should allow it, I think.

#9

GenericColumnWrapper had no code… it didn’t do anything. What were you thinking of here?

I noticed that JDBCInteraction can replace GenericColumn altogether if I move the rest of the functionality over. I’m not stuck on any name for the class, it can be named GenericColumn if you’d like.

The biggest problem with ColumnType: We’re leaving orbit (ANSI SQL). We can either require/provide the user only ANSI SQL types or we’re likely heading into Dialect territory:

TimestampType.mssqlDateTime2(); // Calls TimestampType.TIMESTAMP("datetime2");

I do like giving the user type options in a neat package, as it also serves as an example for do-it-yourselfers. JDBCInteraction doesn’t prohibit one from customization, but it does (currently) prohibit one from using ResultSet:getOneType and PreparedStatement:setAnother. I’ve been waiting for your reaction to that.

Validation: You answered my question. If I define a DaoDescriptor with a varchar field and the table’s column is really a clob, that’s not an error scenario. Feature, not a bug.


#10

I thought with the correct wrapper code, you could implement everything as a GenericColumn and get rid of the DataColumnFactory and even the AbstractColumn. There might be some pitfalls to that plan. I haven’t tried it out.

I think I like the name GenericColumn better than JDBCInteraction. But I was a bit sheepish about the name GenericColumn. It is a bit confusing, since it’s not actually a Column in the hrorm sense.

I am not sure I see how we’re leaving ANSI SQL yet. But, yeah, if we are, I am against that.

I think the purpose of hrorm’s validation is to validate hrorm. It’s not really saying that your schema is perfectly designed or that all the types you are using are the most preferred. It’s saying: can I transform values from the columns you have to the object model you have. That’s it.


#11

Upgrading all the projects to hrorm 0.10.0 today, later on I’ll do what I can to finish this branch off. Been a busy week for me.

I’ll give this direction a shot.

Well, JDBCInteraction is kinda, ugh, as well. I’ll make this GenericColumn for right now.

We’ve kind of accidentally stumbled upon supporting dialects with the ColumnType subclasses. We’ll provide methods for providing ANSI SQL types, but there’s nothing stopping someone from making a MSSQLTimestampType subclass (of TimestampType), for example, and using that to make MSSQL-specific ‘datetime2’ JDBCInteractions.

I guess what I’m getting at here is, you might eventually get a feature request for something dialect-specific now that this machinery is getting exposed. If you’re feeling nice and want to support those requests, I’d suggest making a separate pom per dialect (eg, hrorm-dialect-postgres).


#12

I hope that upgrading is not too much trouble. I know some things have changed, but I think most of hrorm has been stable. For some value of stable.

As far as getting feature requests, any request is a good request. At least until hrorm gets a lot more popular than it is today. :smiley:


#13

I am pretty far along with trying out my idea of replacing everything with the GenericColumn.

I hope you haven’t spent a lot of time on this.

I think it will turn out better and make the extensions a first-class citizen.


#14

Release 0.10.2 is now out. It follows through on some of the ideas we have been discussing here.

The internals of all the data column types are now implemented as GenericColumn for the various T’s. There are now some static instances of various other columns: int, float, double, and a couple of others. Putting in new instances is basically trivial if we want to support CLOBs or whatever. But of course, it’s pretty trivial for hrorm users to do it for themselves.

The best change is that you can now use GenericColumns in Where clauses. That turned out to be very easy to add. There is even more clean up to do where a little bit of excess code could be cleaned out, but it’s not really a big deal one way or another.


#15

I’ll take a look at 0.10.2. I need to be a little better with following through on some of this stuff, I’ve just had priorities at home and at work that leave me with little left in terms of time and energy.

Upgrading is unpleasant any time any project changes its publicly exposed interfaces. Just the nature of the beast. But I’ll say- each time I’ve upgraded HRORM in an environment, it generally means removing a bunch of code and cleaning up messes from awkward interactions in or out of the library earlier.

My gripes from when I first talked with you have largely been resolved, the top three remaining are:

  1. I like Streams as a possible “hrorm Cursor” and the possibilities it opens. foldingSelect means I don’t have to fully buffer lists of results, so my complaints here are merely personal opinion. If I really wanted it, I’d make “StORM” (Streaming ORM) and do the work to maintain it =).
  2. You say “Integer”… but don’t mean it. Maybe consider .withLongColumn for that use case?
  3. Lack of support for IN is the primary functional shortcoming from my perspective, but this could make query generation really tricky. or() seems to be lower hanging fruit. Either way, supporting batches of possible values for a field would be pretty awesome if it isn’t too intrusive.

#16

Lots of things to reply to here. :smiley:

Hrorm is pretty opinionated about how to design a domain model. I admit that.

There are probably four things that you want in a model:

  1. An integral type
  2. A fractional type
  3. A string type
  4. A date/time type

(And of course a way to build compound objects and lists of objects. Hrorm supports that.)

Hrorm also adds

  1. A boolean type
  2. Mechanisms for supporting enumerated types

So, what were the choices here?

  1. The most natural thing for an integral type is probably int/Integer. But in the predominantly 64 bit world we live in today, ints are just too small. So I went with long. Really, using an int is kind of an archaic pessimisation at this point. I am trying to train myself to just type long every time, but I fail about … 95% of the time.
  2. Floats and doubles are great for fast arithmetic and save object overhead compared to BigDecimals, but for anything you want to persist, the problems of floating point conversions are just way too painful and mean you have to think too hard all the time. Hrorm really wants to guide you to use BigDecimal. (Incidentally, this is one nice thing about C#, there’s a built-in decimal value type.)
  3. Well, yeah, this just has to be String in Java. I think no one would disagree with that.
  4. I definitely wanted to use one of the modern java.time classes, not java.sql.Timestamp or java.util.Date or any of those badly out of date classes. As has been discussed, I wrongly picked LocalDateTime, since I did not understand the problems with it. Changing that to Instant was a pain, but it was the right thing to do.
  5. Nothing really to say about booleans. Java gives you a boolean type, and if you’re going to make that a first class concept, well, there it is.
  6. The Converter concept is a bit of an annoyance, but I still have not thought of anything better. It’s the one place where hrorm requires the client to implement one of its interfaces. It’s an easy interface to implement, but still … it’s an imposition. The alternative I considered is to just have you provide two functions: Function<THING, CODE> and Function<CODE, THING>, but that’s potentially even more annoying, and makes one more argument to methods that already take 4 arguments. I try to max out at 4 arguments, and most of hrorm’s public interfaces do so. Obviously, 2 or 3 is even better. Also, hrorm has the opinion that you should record strings for enumerations, it makes that easier than using integers (either longs or ints), since that makes the DB itself more readable. That’s another reason why hrorm originally made the default persistence of booleans “T” and “F”, but you convinced me that was the wrong default.
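The two options in the last item are closer than they might look: a converter interface can offer a static factory that accepts the two functions. A small sketch under that assumption; TwoWayConverter, Color and ConverterDemo are invented names for illustration and are not hrorm's Converter:

```java
import java.util.function.Function;

// Hypothetical interface for illustration; not hrorm's Converter.
interface TwoWayConverter<THING, CODE> {

    CODE from(THING thing);

    THING to(CODE code);

    // Builds a converter from two functions, so clients who prefer
    // lambdas or method references do not need to write a class.
    static <THING, CODE> TwoWayConverter<THING, CODE> of(
            Function<THING, CODE> encode, Function<CODE, THING> decode) {
        return new TwoWayConverter<THING, CODE>() {
            @Override
            public CODE from(THING thing) {
                return encode.apply(thing);
            }

            @Override
            public THING to(CODE code) {
                return decode.apply(code);
            }
        };
    }
}

enum Color { RED, GREEN }

class ConverterDemo {
    // Persists the enum by name, keeping the stored value readable.
    static final TwoWayConverter<Color, String> COLOR =
            TwoWayConverter.of(Color::name, Color::valueOf);
}
```

A column method could then still take a single converter argument, which keeps the argument count at four.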

Well, anyway, that’s a long way around to explain one thing: why hrorm says “integer” where it actually means “long”.

I have been thinking about changing it. Especially now that the GenericColumn.Integer has been added, it’s even more confusing.

But, as you pointed out, hrorm has made a series of changes to its public interfaces, and that’s just one more. It is probably the right thing to do though.


#17

So, I like streams too. But I hate laziness in this case.

(Reminder: https://github.com/ojplg/hrorm/pull/4#issuecomment-459982010)

I wrote a bunch of Clojure at one time that used a DB library that returned lazy sequences for returning things from the DB. And you always had to do one of two things:

  1. Force the sequence to be realized and then close the connection and then return
  2. Write your code inside-out: pass the processor of the results to the database requesting method

Well, I am a well known idiot, but I was constantly screwing it up and having to rewrite things.

Hrorm is simple. You either:

  1. Get a fully instantiated List, or
  2. Use the folding mechanism that forces you through types to write your code inside out

To me, it’s the difference between “possible to use correctly” and “difficult to screw up”.

If there’s a way to use streams that’s difficult to screw up, I would certainly consider it, but so far, I have not seen it.
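To illustrate what "inside out" means here, a minimal sketch with an in-memory stand-in for a database cursor. FakeCursor and FoldDemo are hypothetical and this is not hrorm's foldingSelect; the idea shown is only that the caller hands in the accumulator, and the cursor is opened, drained and closed before the method returns:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical in-memory stand-in for a database cursor.
class FakeCursor implements AutoCloseable {
    private final Iterator<Long> rows;
    boolean closed = false;

    FakeCursor(List<Long> rows) {
        this.rows = rows.iterator();
    }

    boolean hasNext() {
        if (closed) {
            throw new IllegalStateException("cursor closed");
        }
        return rows.hasNext();
    }

    Long next() {
        return rows.next();
    }

    @Override
    public void close() {
        closed = true;
    }
}

class FoldDemo {

    // The caller supplies the processing; the resource never escapes this method.
    static <R> R fold(FakeCursor cursor, R identity, BiFunction<R, Long, R> accumulator) {
        try (FakeCursor c = cursor) {
            R result = identity;
            while (c.hasNext()) {
                result = accumulator.apply(result, c.next());
            }
            return result;
        }
    }

    public static void main(String[] args) {
        FakeCursor cursor = new FakeCursor(Arrays.asList(1L, 2L, 3L));
        Long sum = fold(cursor, 0L, Long::sum);
        System.out.println(sum + " closed=" + cursor.closed);
    }
}
```

Only one row needs to be held at a time, and there is no point at which the caller holds an open cursor.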


#18

The IN/NOT IN operators are annoying. I just don’t see a good way to support them yet.

I guess you could count the number of things in the list, generate the PreparedStatement with the appropriate number of “?” and then iterate and bind all the variables? I dunno … maybe that’s not so ugly. Could be worth trying out.
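That idea is short enough to sketch. The InClause class below is hypothetical, assumes the values are longs, and is not hrorm code; it just generates one "?" per value and then binds them in order:

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of hrorm.
class InClause {

    // Produces e.g. "id IN (?, ?, ?)" for count == 3.
    static String sql(String column, int count) {
        if (count < 1) {
            throw new IllegalArgumentException("IN needs at least one value");
        }
        return column + " IN (" + String.join(", ", Collections.nCopies(count, "?")) + ")";
    }

    // Binds the values to consecutive parameters starting at startIndex
    // and returns the next free parameter index.
    static int bind(PreparedStatement statement, int startIndex, List<Long> values)
            throws SQLException {
        int index = startIndex;
        for (Long value : values) {
            statement.setLong(index++, value);
        }
        return index;
    }
}
```

One thing to watch is that some databases cap the number of bind parameters or list entries, so very large lists would need batching.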

But some part of me just feels like IN/NOT IN is a bit of a cop-out. It’s like “I have not normalized my data sufficiently to be able to express in some logical way what I am looking for, so just look for this random collection of things that I just globbed together.”

Though, it’s possible that’s just a rationalization for not having thought of a good way to support this. :smiley:


#19

Well, my PR had all the stuff there to properly close any resources in the case of an exception, so I’ll go into this a bit…

I think “difficult to screw up” is not an accurate descriptor of the core issue in question. At the very least our ideas of what makes something “difficult to screw up” differ.

The current standard as I understand it is no laziness, or, “HRORM is done at the end of the method call”. Given that limitation, it is not possible to implement Stream in a way other than taking your fully buffered list and calling .stream() on it, which would be meaningless.

However, from my perspective, dao.selectAll() especially, but really any of your List-returning methods, makes it relatively easy to “screw up” as you put it, because you’re making the assumption the result set fits into memory when there’s no guarantee it does.

Then we have to ask, is JDBC’s ResultSet easy, or difficult to screw up? Yes the JDBC API is a little low-levelish, unwieldy and somewhat archaic, and arguably it’s one of the worst things I could invoke to bolster my point, but ResultSet is essentially a Cursor and doesn’t pretend to be something else. Streams are not Lists, nor are they aimed at replacing Lists. Streams are not designed to be consumed multiple times and there are no assumptions around length, which sounds familiar, does it not?

If someone takes a returned Stream, then immediately calls connection.close() or stream.collect(Collectors.toList()) on that using an oversized result set, that isn’t me making it easy for them to screw up, that’s them going out of their way to screw up. And foldingSelect does have some similar problems to Stream in this way: yes, there’s the method boundary where “HRORM is done”, but typically you’re passing in a Lambda or Method Reference just the same as you would be in Stream.map or Stream.collect, and until the iteration completes, HRORM is not done. Errors can just as easily happen in the fold’s iteration as with Stream.map().collect().

What if I construct a Dao, immediately close the connection, and return the Dao as an object in a helper method call? That’s not difficult. Making it difficult, would be forcing the developer to provide a JDBC connection string, or a Supplier for Connection instead, and HRORM managing the connection, opening it when convenient, closing it when convenient, implementing AutoCloseable and whatnot.

So where then, is the additional risk in returning a Stream over foldingSelect? Connection failures in the space between Stream generation (method boundary) and consumption? Connection-killers can still happen during the foldingSelect call- even if you’re assuming a single-threaded environment. What about if I go and do something like hold onto that Stream until (much) later or try to pass it over a Serialization boundary (Streams are not serializable)? Did you really make it easy for me to screw up, or am I going out of my way to screw up when I cram that Stream into a cache or map and save it for later without consuming it (instead of doing what’s right and putting the Criteria or Wheres into a cache, given the Stream is lazy to begin with)?
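For readers following along, here is one generic way a lazily consumed Stream can carry its own cleanup, using Stream.onClose together with try-with-resources. This is only an assumption-laden sketch with an in-memory list standing in for a ResultSet; LazyRows is a made-up class and this is not the code from the PR mentioned above:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical sketch; the AtomicBoolean stands in for closing a ResultSet/Statement.
class LazyRows {

    // Wraps an iterator in a lazy Stream and registers cleanup to run on close().
    static Stream<Long> stream(List<Long> rows, AtomicBoolean closed) {
        Iterator<Long> cursor = rows.iterator();
        return StreamSupport
                .stream(Spliterators.spliteratorUnknownSize(cursor, Spliterator.ORDERED), false)
                .onClose(() -> closed.set(true));
    }

    // The consumer is responsible for closing; try-with-resources makes that automatic.
    static long sum(List<Long> rows, AtomicBoolean closed) {
        try (Stream<Long> s = stream(rows, closed)) {
            return s.mapToLong(Long::longValue).sum();
        }
    }

    public static void main(String[] args) {
        AtomicBoolean closed = new AtomicBoolean(false);
        System.out.println(sum(Arrays.asList(1L, 2L, 3L), closed) + " closed=" + closed.get());
    }
}
```

The catch, which is really the disagreement in this thread, is that the close handlers only run if the caller closes the stream; a terminal operation alone does not trigger them.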

Personally, I’d go further than “supporting” Stream and instead mandate it, no List<> methods at all- force the implementer to deal with unknown lengths of data and the one-use nature of Stream, since that is the proper way to interact with databases anyway. If their application requires making big Lists or Sets, have them issue a collect themselves and go out of their way to fail by filling the heap with objects or close a connection right after generating the entity Stream that has yet to be consumed. If they’re truly stunned by the resulting failures they have more issues than the usability of their ORM solution.

Yikes. Reading the above, it does indeed largely boil down to the “implementer should know better” argument, but I think there’s a good point or two there. I do think it’s fair to say that “difficult to screw up” isn’t exactly the right terminology here; we’re talking about something a little more specific that has to do with design decisions. And again, this is all really my opinion, and that’s talk. HRORM is the result of action, and the current approach is done by design. If I really wanted it, I’d be willing to go and maintain it. If you implemented this, you would have to maintain it. You would own it.

I’ll pipe down on this a bit since I’ll admit, I’ve been rather obnoxious about it. Appreciate the willingness to hash this out in discussion.


#20

If you post a link to your implementation of streams, I will take another look. I do not recall exactly how you had it working.

But, I am just going to take a look and think about things. No promises. I think we are just going to have to agree to disagree.

Like I said, I’m not against streams generally, or even laziness generally, it’s that I think this is a serious pitfall for hrorm users. Obviously “better” developers with more understanding of streams are less likely to hurt themselves, but since that includes me hrorm has to be dumbed down.

:smiley: