Hrorm Forums

kazetsukai · 2019-04-03 15:19:49 UTC

I’ve further fleshed this out, again alof of boilerplate. Alot of tests fail when using stringColumn() -> genericColumn override because of:

        @Override
        public Set<Integer> supportedTypes() {
            return Collections.singleton(genericColumn.sqlType());
        }

Exception:

org.hrorm.HrormException: Column name does not support type 2005
        at org.hrorm.Validator.validate(Validator.java:45)
	at org.hrorm.ComplexTest.testDaoValidation(ComplexTest.java:178)
 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Reason: I probably need to pass supportedTypes along with ColumnType, then my instinct would say that if sqlTypeName is specified, we need to use Collections.singleton like above and throw the exception shown. If no sqlTypeName is specified, we should default to something for Schema to be able to do its work, and pass all supported types out here.

Is it actually a failure scenario if our descriptor was created with VARCHAR but JDBC tells us the datastore is using CLOB?

shrug.

Validation may need to change… but I’m hesitant to just push forward with something without a little more thought/discussion on it. At the very least the error content should change to suggest a solution of some type, or more specifically identify the problem to the developer in a way they can solve it without spelunking into HRORM.

kazetsukai · 2019-04-03 15:34:11 UTC

For the time being, I did this to get the tests over the line. May or may not be the right approach.

Again, the question remains whether this is an error condition or not. DataColumnFactory is not the right place to define the default type names, either (“text”, “integer”).

ojplg · 2019-04-03 22:10:32 UTC

I think the basic ideas here are right, but am still pondering some of the details.

Not sure why you got rid of the GenericColumnWrapper. That seemed right to me. I think that gets rid of most of the code in the DataColumnFactory. Maybe that can disappear altogether eventually.
I think the ColumnType class has merit. Clearly the string schema representation and the Java int value are related. The problem is that sometimes you want to change the schema/string value, but leave the int alone. So, I’m 50/50 on this.
I am not sure what separating the GenericColumn/JDBCInteraction into two classes buys. The GenericColumn has only one member variable, and the JDBCInteraction only exists in one place. I think this can be collapsed back into one class.
I think providing all the types to the user out-of-the-box is right, and we should keep that.
Not sure what the validation issue is with Clob/string/whatever. If it works than the validator should allow it, I think.

kazetsukai · 2019-04-04 00:15:54 UTC

GenericColumnWrapper had no code… it didn’t do anything. What were you thinking of here?

I noticed that JDBCInteraction can replace GenericColumn altogether if I move the rest of the functionality over. I’m not stuck on any name for the class, it can be named GenericColumn if you’d like.

The biggest problem with ColumnType: We’re leaving orbit (ANSI SQL). We can either require/provide the user only ANSI SQL types or we’re likely heading into Dialect territory:

TimestampType.mssqlDateTime2(); // Calls TimestampType.TIMESTAMP("datetime2");

I do like giving the user type options in a neat package, as it also serves as an example for do-it-yourselfers. JDBCInteraction doesn’t prohibit one from customization, but it does (currently) prohibit one from using ResultSet:getOneType and PreparedStatement:setAnother, I’ve been waiting for your reaction to that.

Validation: You answered my question. If I define a DaoDescriptor with a varchar field and the table’s column is really a clob, that’s not an error scenario. Feature, not a bug.

ojplg · 2019-04-04 16:53:49 UTC

I thought with the correct wrapper code, you could implement everything as a GenericColumn and get rid of the DataColumnFactory and even the AbstractColumn. There might be some pitfalls to that plan. I haven’t tried it out.

I think I like the name GenericColumn better than JDBCInteraction. But I was a bit sheepish about the name GenericColumn. It is a bit confusing, since it’s not actually a Column in the hrorm sense.

I am not sure I see how we’re leaving ANSI SQL yet. But, yeah, if we are, I am against that.

I think the purpose of hrorm’s validation is to validate hrorm. It’s not really saying that your schema is perfectly designed or that all the types you are using are the most preferred. It’s saying: can I transform values from the columns you have to the object model you have. That’s it.

kazetsukai · 2019-04-08 14:23:21 UTC

Upgrading all the projects to hrorm 0.10.0 today, later on I’ll do what I can to finish this branch off. Been a busy week for me.

I’ll give this direction a shot.

Well, JDBCInteraction is kinda, ugh, as well. I’ll make this GenericColumn for right now.

We’ve kind of accidentally stumbled upon supporting dialects with the ColumnType subclasses. We’ll provide methods for providing ANSI SQL types, but there’s nothing stopping someone from making a MSSQLTimestampType subclass (of TimestampType), for example, and using that to make MSSQL-specific ‘datetime2’ JDBCInteractions.

I guess what I’m getting at here is, you might eventually get a feature request for something dialect-specific now that this machinery is getting exposed. If you’re feeling nice and want to support those requests, I’d suggest making a separate pom per dialect (eg, hrorm-dialect-postgres).

ojplg · 2019-04-09 02:40:13 UTC

I hope that upgrading is not too much trouble. I know some things have changed, but I think most of hrorm has been stable. For some value of stable.

As far as getting feature requests, any request is a good request. At least until hrorm gets a lot more popular than it is today.

ojplg · 2019-04-12 18:46:58 UTC

I am pretty far along with trying out my idea of replacing everything with the GenericColumn.

I hope you haven’t been spent a lot of time on this.

I think it will turn out better and make the extensions a first-class citizen.

ojplg · 2019-04-13 03:55:59 UTC

Release 0.10.2 is now out. It follows through on some of the ideas we have been discussing here.

The internals of all the data column types are now implemented as GenericColumn for the various T’s. There are now some static instances of various other columns: int, float, double, and a couple of others. Putting in new instances is basically trivial if we want to support CLOBs or whatever. But of course, it’s pretty trivial for hrorm users to do it for themselves.

The best change is that you can now use GenericColumns in Where clauses. That turned out to be very easy to add. There is even more clean up to do where a little bit of excess code could be cleaned out, but it’s not really a big deal one way or another.

kazetsukai · 2019-04-14 01:42:46 UTC

I’ll take a look at 0.10.2. I need to be a little better with following through on some of this stuff, I’ve just had priorities at home and at work that leave me with little left in terms of time and energy.

Upgrading is unpleasant any time any project changes its publicly exposed interfaces. Just the nature of the beast. But I’ll say- each time I’ve upgraded HRORM in an environment, it generally means removing a bunch of code and cleaning up messes from awkward interactions in or out of the library earlier.

My gripes from when I first talked with you have largely been resolved, the top three remaining are:

I like Streams as a possible “hrorm Cursor” and the possibilities it opens. foldingSelect means I don’t have to fully buffer lists of results, so my complaints here are merely personal opinion. If I really wanted it, I’d make “StORM” (Streaming ORM) and do the work to maintain it =).
You say “Integer”… but don’t mean it. Maybe consider .withLongColumn for that use case?
Lack of support for IN is the primary functional shortcoming from my perspective, but this could make query generation really tricky. or() seems to be lower hanging fruit. Either way, supporting batches of possible values for a field would be pretty awesome if isn’t too intrusive.

ojplg · 2019-04-14 03:04:15 UTC

Lots of things to reply to here.

Hrorm is pretty opinionated about how to design a domain model. I admit that.

There are probably four things that you want in a model:

An integral type
A fractional type
A string type
A date/time type

(And of course a way to build compound objects and lists of objects. Hrorm supports that.)

Hrorm also adds

A boolean type
Mechanisms for supporting enumerated types

So, what were the choices here?

The most natural thing for an integral type is probably int/Integer. But in the predominantly 64 bit world we live in today, ints are just too small. So I went with long. Really, using an int is kind of an archaic pessimisation at this point. I am trying to train myself to just type long every time, but I fail about … 95% of the time.
Floats and doubles are great for fast arithmetic and save object overhead compared to BigDecimals, but for anything you want to persist, the problems of floating point conversions are just way too painful and mean you have to think too hard all the time. Hrorm really wants to guide you to use BigDecimal. (Incidentally, this is one nice thing about C#, there’s a built-in decimal value type.)
Well, yeah, this just has to be String in Java. I think no one would disagree with that.
I definitely wanted to use one of the modern java.time classes, not java.sql.Timestamp or java.util.Date or any of those badly out of date classes. As has been discussed, I wrongly picked LocalDateTime, since I did not understand the problems with it. Changing that to Instant was a pain, but it was the right thing to do.
Nothing really to say about booleans. Java gives you a boolean type, and if you’re going to make that a first class concept, well, there it is.
The Converter concept is a bit of an annoyance, but I still have not thought of anything better. It’s the one place where hrorm requires the client to implement one of its interfaces. It’s an easy interface to implement, but still … it’s an imposition. The alternative I considered is to just have you provide two functions: Function<THING, CODE> and Function<CODE, THING>, but that’s potentially even more annoying, and makes one more argument to methods that already take 4 arguments. I try to max out at 4 arguments, and most of hrorm’s public interfaces do so. Obviously, 2 or 3 is even better. Also, hrorm has the opinion that you should record strings for enumerations, it makes that easier than using integers (either longs or ints), since that makes the DB itself more readable. That’s another reason why hrorm originally made the default persistence of booleans “T” and “F”, but you convinced me that was the wrong default.

Well, anyway, that’s a long way around to explain one thing: why hrorm says “integer” where it actually means “long”.

I have been thinking about changing it. Especially now that the GenericColumn.Integer has been added, it’s even more confusing.

But, as you pointed out, hrorm has made a series of changes to its public interfaces, and that’s just one more. It is probably the right thing to do though.

ojplg · 2019-04-14 03:23:21 UTC

So, I like streams to. But I hate laziness in this case.

(Reminder: https://github.com/ojplg/hrorm/pull/4#issuecomment-459982010)

I wrote a bunch of Clojure at one time that used a DB library that returned lazy sequences for returning things from the DB. And you always had to do one of two things:

Force the sequence to be realized and then close the connection and then return
Write your code inside-out: pass the processor of the results to the database requesting method

Well, I am a well known idiot, but I was constantly screwing it up and having to rewrite things.

Hrorm is simple. You either:

Get a fully instantiated List, or
Use the folding mechanism that forces you through types to write your code inside out

To me, it’s the difference between “possible to use correctly” and “difficult to screw up”.

If there’s a way to use streams that’s difficult to screw up, I would certainly consider it, but so far, I have not seen it.

ojplg · 2019-04-14 03:28:43 UTC

The IN/NOT IN operators are annoying. I just don’t see a good way to support them yet.

I guess you could count the number of things in the list, generate the PreparedStatement with the appropriate number of “?” and then iterate and bind all the variables? I dunno … maybe that’s not so ugly. Could be worth trying out.

But some part of me just feels like IN/NOT IN is a bit of a cop-out. It’s like “I have not normalized my data sufficiently to be able to express in some logical way what I am looking for, so just look for this random collection of things that I just globbed together.”

Though, it’s possible that’s just a rationalization for not having thought of a good way to support this.

kazetsukai · 2019-04-15 15:58:51 UTC

Well, my PR had all the stuff there to properly close any resources in the case of an exception, so I’ll go into this a bit…

I think “difficult to screw up” is not an accurate descriptor of the core issue in question. At the very least our ideas of what makes something “difficult to screw up” differs.

The current standard as I understand it is no laziness, or, “HRORM is done at the end of the method call”. Given that limitation, it is not possible to implement Stream in a way other than taking your fully buffered list and calling .stream() on it, which would be meaningless.

However, from my perspective dao.selectAll() especially, but really any of your List-return methods- makes it relatively easy to “screw up” as you put it, because you’re making the assumption the result set fits into memory when there’s no guarantee it does.

Then we have to ask, is JDBC’s ResultSet easy, or difficult to screw up? Yes the JDBC API is a little low-levelish, unwieldy and somewhat archaic, and arguably its one of the worst things I could invoke to bolster my point, but ResultSet is essentially a Cursor and doesn’t pretend to be something else. Streams are not Lists, nor are aimed at replacing Lists. Streams are not designed to be consumed multiple times and there are no assumptions around length, which sounds familiar, does it not?

If someone takes a return Stream, then immediately calls connection.close() or stream.collect(Collectors.toList()) on that using an oversized result set, that isn’t me making it easy for them to screw up, that’s them going out of their way to screw up. And foldingSelect does have some similar problems to Stream in this ways- yes, there’s the method boundary where “HRORM is done”, but typically you’re passing in a Lambda or Method Reference just the same as you would be in Stream.map or Stream.collect, and until the iteration completes, HRORM is not done. Errors can just as easily happen in the fold’s iteration as with Stream.map().collect().

What if I construct a Dao, immediately close the connection, and return the Dao as an object in a helper method call? That’s not difficult. Making it difficult, would be forcing the developer to provide a JDBC connection string, or a Supplier for Connection instead, and HRORM managing the connection, opening it when convenient, closing it when convenient, implementing AutoCloseable and whatnot.

So where then, is the additional risk in returning a Stream over foldingSelect? Connection failures in the space between Stream generation (method boundary) and consumption? Connection-killers can still happen during the foldingSelect call- even if you’re assuming a single-threaded environment. What about if I go and do something like hold onto that Stream until (much) later or try to pass it over a Serialization boundary (Streams are not serializable)? Did you really make it easy for me to screw up, or am I going out of my way to screw up when I cram that Stream into a cache or map and save it for later without consuming it (instead of doing what’s right and putting the Criteria or Wheres into a cache, given the Stream is lazy to begin with)?

Personally, I’d go further than “supporting” Stream and instead mandate it, no List<> methods at all- force the implementer to deal with unknown lengths of data and the one-use nature of Stream, since that is the proper way to interact with databases anyway. If their application requires making big Lists or Sets, have them issue a collect themselves and go out of their way to fail by filling the heap with objects or close a connection right after generating the entity Stream that has yet to be consumed. If they’re truly stunned by the resulting failures they have more issues than the usability of their ORM solution.

Yikes. Reading the above, it does indeed largely boil down to the “implementer should know better” argument, but I think there’s a good point or two there. I do think its fair to say that “difficult to screw up” isn’t the exactly the right terminology here, we’re talking about something at little more specific that has to do with design decisions. And again, this is all really my opinion, and that’s talk. HRORM is the result of action, and the current approach is done by design. If I really wanted it, I’d be willing to go and maintain it. If you implemented this, you would have to maintain it. You would own it.

I’ll pipe down on this a bit since I’ll admit, I’ve been rather obnoxious about it. Appreciate the willingness to hash this out in discussion.

ojplg · 2019-04-15 17:40:31 UTC

If you post a link to your implementation of streams, I will take another look. I do not recall exactly how you had it working.

But, I am just going to take a look and think about things. No promises. I think we are just going to have to agree to disagree.

Like I said, I’m not against streams generally, or even laziness generally, it’s that I think this is a serious pitfall for hrorm users. Obviously “better” developers with more understanding of streams are less likely to hurt themselves, but since that includes me hrorm has to be dumbed down.

ojplg · 2019-04-15 17:42:34 UTC

Also, you have not been obnoxious at all. Just giving your opinion about how things should work in a fair and productive way.

At hrorm.org, we welcome criticism. All one of us.

kazetsukai · 2019-04-15 18:33:11 UTC

I’ll try and update it to 0.10.2 tonight and link it here… it was a rather trivial add IIRC.

We’re all adults here, again I’m admitting I’ve been obnoxious about it.

And don’t feel too sorry for me now- I can probably wrap around foldingSelect to get an entity Stream with some magic Consumer like a BlockingQueue and make that into a utility method somewhere- since I’m working in multi-threaded environments anyway.

Update: I wrote this out of curiosity for how a BlockingQueue solution would work. For a first pass it isn’t entirely awful.

public <E> Stream<E> foldingStream(KeylessDao<E> dao, Where where) {
        // I'd use the container's ManagedExecutorService here.
	Executor single = Executors.newSingleThreadExecutor(); 
	final boolean[] done = { false };
	final BlockingQueue<E> queue =new LinkedBlockingQueue<>(1);
	((ExecutorService) single).submit(() -> {
		dao.foldingSelect(queue, (q, e) -> {
			try {
				q.put(e);
				return q;
			} catch (InterruptedException e1) {
				throw new RuntimeException(e1);
			}
		}, where);
		done[0] = true;
	});

	return StreamSupport.stream(
			Spliterators.spliteratorUnknownSize(new Iterator<E>() {
				@Override
				public boolean hasNext() {
					return !queue.isEmpty() || !done[0];
				}

				@Override
				public E next() {
					try {
						return queue.take();
					} catch (InterruptedException e) {
						throw new RuntimeException(e);
					}
				}
			}, Spliterator.ORDERED)
	, false);
}

kazetsukai · 2019-04-15 19:47:25 UTC

Also I think this is something we could address, and I was reminded of it by one of your previous arguments vs Streams- what do we do in the case of connection pooling, etc.

It might be a good idea to take a connection “Supplier” (https://docs.oracle.com/javase/7/docs/api/javax/sql/DataSource.html looks like the winner actually) rather than a connection, and it might be a good idea to manage and watch the connection inside of HRORM itself. Then any code that utilizes a connection would have to call something like:

public void getConnection(Consumer<Connection> connectionConsumer) {
	DataSource dataSource = getDatasource();
        // Any retries / fault tolerances would have to go here.
        // Try -> Fail -> Rollback TX -> Try again
	try (Connection connection = dataSource.getConnection()) {
		connectionConsumer.accept(connection);
	} catch (SQLException e) {
		throw new HrormException(e);
	}
}

Switching to DataStore enables connection pooling support with C3P0 (http://www.mchange.com/projects/c3p0/index.html) and DBCP (http://commons.apache.org/dbcp/)

Edit 3:
Oooh, look at this!
https://docs.oracle.com/javase/8/docs/api/java/sql/JDBCType.html

ojplg · 2019-04-15 23:15:29 UTC

Your solution looks much cleaner than mine.

gist.github.com

https://gist.github.com/ojplg/7d36380e469997c9090dd079e0d7c0a5

StreamSelector.java

package org.hrorm;

import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class StreamSelector<ENTITY> {

This file has been truncated. show original

But really, I think both of these lead me to think it’s pretty hard to do it outside of hrorm itself.

ojplg · 2019-04-15 23:19:41 UTC

Regarding the DataSource or Supplier<Connection> have you taken a look at the Transactor class? It somewhat does what you are talking about. Extending it to use a DataSource should be quite simple.

Also, I do not think this is an either/or question. Why not have the DaoBuilder objects construct a Dao from all of the above?

New Release 0.10.0