tgies hates software: XML Schemas: Some Ground Rules

From: tgies
Date: 12:32 on 27 Sep 2007
Subject: XML Schemas: Some Ground Rules

Attention jerks,
Okay, so you're going to use XML for every imaginable thing which you
can possibly contrive a way to use XML for, including uncompressed RGB
raster images, large relational databases, and the syntax for new
procedural imperative programming languages. Fine. Fine. I suppose I
can't stop you. After all, it is somehow considered new and it is
somehow considered Web 2.0 and there is an X in the name.

But, for pity's sake, if you are going to define a new XML schema,
please do not be a complete retard. XML has these things called
"attributes" for a reason. You do not need to define a separate tag
for every possible property an element in your XML document may have.
There is a reason that it is <font color="#FF0000" face="Comic Sans
MS">Hello</font> and not
<font>
  <color>#FF0000</color>
  <face>Comic Sans MS</face>
  <text>Hello</text>
</font>. Oh my God.

From: Michael G Schwern
Date: 13:02 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

tgies wrote:
> Attention jerks,
> Okay, so you're going to use XML for every imaginable thing which you
> can possibly contrive a way to use XML for, including uncompressed RGB
> raster images, large relational databases, and the syntax for new
> procedural imperative programming languages. Fine. Fine. I suppose I
> can't stop you. After all, it is somehow considered new and it is
> somehow considered Web 2.0 and there is an X in the name.
> 
> But, for pity's sake, if you are going to define a new XML schema,
> please do not be a complete retard. XML has these things called
> "attributes" for a reason. You do not need to define a separate tag
> for every possible property an element in your XML document may have.
> There is a reason that it is <font color="#FF0000" face="Comic Sans
> MS">Hello</font> and not
> <font>
>   <color>#FF0000</color>
>   <face>Comic Sans MS</face>
>   <text>Hello</text>
> </font>. Oh my God.

I would like, at this point, to pimp YAML a little.

font:
  color: #FF0000
  rgb:   [255,0,0]
  face:  Comic Sans MS
  text:  >-
Oh hey, look.  It's a human
readable block of soft-wrapped
text!

From: Michael G Schwern
Date: 13:11 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

Michael G Schwern wrote:
> tgies wrote:
>> Attention jerks,
>> Okay, so you're going to use XML for every imaginable thing which you
>> can possibly contrive a way to use XML for, including uncompressed RGB
>> raster images, large relational databases, and the syntax for new
>> procedural imperative programming languages. Fine. Fine. I suppose I
>> can't stop you. After all, it is somehow considered new and it is
>> somehow considered Web 2.0 and there is an X in the name.
>>
>> But, for pity's sake, if you are going to define a new XML schema,
>> please do not be a complete retard. XML has these things called
>> "attributes" for a reason. You do not need to define a separate tag
>> for every possible property an element in your XML document may have.
>> There is a reason that it is <font color="#FF0000" face="Comic Sans
>> MS">Hello</font> and not
>> <font>
>>   <color>#FF0000</color>
>>   <face>Comic Sans MS</face>
>>   <text>Hello</text>
>> </font>. Oh my God.
> 
> I would like, at this point, to pimp YAML a little.
> 
> font:
>   color: #FF0000
>   rgb:   [255,0,0]
>   face:  Comic Sans MS
>   text:  >-
> Oh hey, look.  It's a human
> readable block of soft-wrapped
> text!

I would like, at this point, to pimp proper YAML.

font:
  color: "#FF0000"
  rgb:   [255,0,0]
  face:  Comic Sans MS
  text:  >-
    Oh hey, look.  It's a human
    readable block of soft-wrapped
    text!

I would also like, at this point, to beg mercy for that last reply lacking in
hate because it's 5am and I didn't even realize which mailing list I was
replying to.

So umm... boy I sure do hate, err-- firewall... web app... mouse clicky thing
with the command line completion done wrong by AT&T with Apple tie in and all
has to run on Windows 95 and wasn't everything better when it was statically
linked?!

Yeah.

*thunk*zzzzzzzzzzzzzzzzzzz

From: Tony Finch
Date: 19:10 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On Thu, 27 Sep 2007, Michael G Schwern wrote:
>
> I would like, at this point, to pimp YAML a little.

Not at all over-engineered!

Tony.

From: Jarkko Hietaniemi
Date: 19:15 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 9/27/07, Tony Finch <dot@xxxxx.xx> wrote:
> On Thu, 27 Sep 2007, Michael G Schwern wrote:
> >
> > I would like, at this point, to pimp YAML a little.
>
> Not at all over-engineered!
>

Ingy tried to make XML refugees feel at home.

From: Adam Atlas
Date: 19:26 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules


On 27 Sep 2007, at 14:10, Tony Finch wrote:

> On Thu, 27 Sep 2007, Michael G Schwern wrote:
>>
>> I would like, at this point, to pimp YAML a little.
>
> Not at all over-engineered!

Also I'm wary of any technology that calls itself "yet another"  
something.

(Yeah yeah, I know YAML officially stands for "YAML Ain't a Markup  
Language", but it used to be "Yet Another...", until they realized  
that it wasn't a markup language.)

From: Michael G Schwern
Date: 00:06 on 28 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

Tony Finch wrote:
> On Thu, 27 Sep 2007, Michael G Schwern wrote:
>> I would like, at this point, to pimp YAML a little.
> 
> Not at all over-engineered!

Over engineered in all the right places. :)

I've got it!  XML is the C of data languages!

The specification is so small and elegant... well, relative to SGML before it.
 It's a portable assembly^Wdata format!  There's so little to learn.  Tags,
attributes and data.  What could be simpler?

Oh, you want to actually DO something with it?  Here, bolt on this GIGANTIC
PILE of standard libraries^W^Hs.  Don't forget to write an XML Schema so you
can know what all those tags mean, or maybe you get a DTD we haven't worked
that bit out yet.  And don't forget your XSLT so you can translate this stuff.
 And CSS so you can make it easy on the eye.  And XPath so you can search it.
 And XQuery because you might be braindead enough to try to use this as a
database.  And XForms to allow users to change your XML.  And...

Oh, and they're all written by different groups so they all work kinda
differently with different feels.

Oh, you want to read and write it?  Here's a pile of special XML generating
editors, libraries, formatters and translators.  Dear god, don't think you
could read and write it by hand!

But don't worry, the language itself is so small and simple and elegant.

From: A. Pagaltzis
Date: 01:17 on 28 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

* Michael G Schwern <schwern@xxxxx.xxx> [2007-09-28 01:10]:
> Tony Finch wrote:
> > On Thu, 27 Sep 2007, Michael G Schwern wrote:
> >> I would like, at this point, to pimp YAML a little.
> > 
> > Not at all over-engineered!
> 
> Over engineered in all the right places. :)

YAML is a gigantic pile of suck. If it corresponded to the
feature set of JSON, it would be fine, but good grief, take
your complexity fetish and stick it somewhere unmentionable.

As it is, I'll have JSON instead please, thankyouverymuch.

> Oh, you want to actually DO something with it? Here, bolt on
> this GIGANTIC PILE of standard libraries^W^Hs. Don't forget to
> write an XML Schema so you can know what all those tags mean,
> or maybe you get a DTD we haven't worked that bit out yet. And
> don't forget your XSLT so you can translate this stuff. And CSS
> so you can make it easy on the eye. And XPath so you can search
> it. And XQuery because you might be braindead enough to try to
> use this as a database. And XForms to allow users to change
> your XML. And...

Yes, I'm sure you can come up with a way to express an XHTML
document in YAML that would be 200x more readable and positively
trivial to process. You and every other genius ranting about XML.

Regards,

From: Michael G Schwern
Date: 02:39 on 28 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

A. Pagaltzis wrote:
> YAML is a gigantic pile of suck. If it corresponded to the
> feature set of JSON, it would be fine, but good grief, take
> your complexity fetish and stick it somewhere unmentionable.
> 
> As it is, I'll have JSON instead please, thankyouverymuch.

But JSON is YAML. :)  JSON even removed comments to make that true.

And JSON has all that distracting mandatory quoting and braces that makes the
eyes bleed.  Blech.  I don't want to have to write that.

{
    "targets" : {
        "compile" : {
            "description" : "compile the source",
            "actions" : [
                {
                    "javac" : {
                        "srcdir" : "${src}",
                        "destdir" : "${build}"
                    }
                }
            ],
            "depends" : "init"
        },
        "dist" : {
            "actions" : [
                {
                    "mkdir" : {
                        "dir" : "${dist}/lib/"
                    }
                },
                {
                    "jar" : {
                        "jarfile" : "${dist}/lib/MyProject-${DSTAMP}.jar",
                        "basedir" : "${build}"
                    }
                }
            ],
            "description" : "generate the distribution",
            "depends" : "compile"
        },
        "clean" : {
            "actions" : {
                "delete" : [
                    {
                        "dir" : "${build}"
                    },
                    {
                        "dir" : "${dist}"
                    }
                ]
            },
            "description" : "clean up"
        },
        "init" : {
            "actions" : [
                "tstamp",
                {
                    "mkdir" : {
                        "dir" : "${build}"
                    }
                }
            ]
        }
    },
    "name" : "MyProject",
    "default" : "dist",
    "description" : "simple example build file",
    "basedir" : ".",
    "properties" : {
        "src" : {
            "location" : "src/"
        },
        "dist" : {
            "location" : "dist/"
        },
        "build" : {
            "location" : "build/"
        }
    }
}

Anyhow, they translate trivially which means YAML vs JSON formatting wars are
moot.

$ cat `which yaml2json`
#!/usr/bin/perl -w

use strict;
use YAML ();
use JSON ();

my $json = JSON->new(pretty => 1, indent => 4);
print $json->objToJson(YAML::Load(join "", <>));

$ cat `which json2yaml`
#!/usr/bin/perl -w

use strict;
use YAML;
use JSON;

print YAML::Dump( jsonToObj( join "", <> ) );

>> Oh, you want to actually DO something with it? Here, bolt on
>> this GIGANTIC PILE of standard libraries^W^Hs. Don't forget to
>> write an XML Schema so you can know what all those tags mean,
>> or maybe you get a DTD we haven't worked that bit out yet. And
>> don't forget your XSLT so you can translate this stuff. And CSS
>> so you can make it easy on the eye. And XPath so you can search
>> it. And XQuery because you might be braindead enough to try to
>> use this as a database. And XForms to allow users to change
>> your XML. And...
> 
> Yes, I'm sure you can come up with a way to express an XHTML
> document in YAML that would be 200x more readable and positively
> trivial to process. You and every other genius ranting about XML.

No, I would not, because that's not what YAML is for (and HTML is its own
writhing pile of hate and its redesign I wouldn't touch with any length pole).

It's designed for a specific, yet common and otherwise unfulfilled, purpose.
YAML is a data serialization language for dynamic languages that's human and
machine readable.  Its core data types are designed to map easily to and from
the way dynamic languages and humans think.  Scalars, lists and pairs.

	Name: 		____________
	Address:	____________
	Phone Number:	____________
	Last Three Employers:
	____________________________
	____________________________
	____________________________

It's not a magical Try To Bolt Everything On And Suck At Most Of Them format.
 But people shove XML into that role all the time, that's what it's being sold
as, the ONE tool for data modeling.  Just encode it in XML and everyone will
be able to read it!  Bullshit.

(X|SG)ML tries to model the whole universe as a big tree while the universe
itself rarely does that.  This leads to all sorts of heavy-weight translation
issues mapping stuff to and from XML whether it's trying to take a real-world
concept from your head and encoding it as XML or trying to make a usable data
structure for your program out of an XML tree.  Translation of concepts
increases costs and adds to the complexity of data transmission, exactly the
thing XML is supposed to prevent.  It just pushes the cost from parsing the
data to mapping the concepts and that's hateful.

Trying to do EVERYTHING with one format is hateful.  It's ok to have more than
one tool.

From: A. Pagaltzis
Date: 17:00 on 28 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

* Michael G Schwern <schwern@xxxxx.xxx> [2007-09-28 03:45]:
> $ cat `which yaml2json`
> #!/usr/bin/perl -w
> 
> use strict;
> use YAML ();
> use JSON ();
> 
> my $json = JSON->new(pretty => 1, indent => 4);
> print $json->objToJson(YAML::Load(join "", <>));

Let me know how that works out for YAML that contains references,
data type annotations and the like.
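
A rough sketch of the problem, in Python with only the standard json
module rather than the Perl above. The assumption is that a YAML loader
turns an anchor/alias pair into one shared object and a self-referencing
alias into a cycle; the structures below are built by hand to stand in
for that, and the names are made up for illustration.

```python
import json

# Stand-in for what a YAML loader would hand back for an anchor that is
# reused through an alias: one object reachable under two keys.
address = {"city": "Houston"}
shared = {"home": address, "work": address}

# JSON has no notion of references, so the shared object is written out
# twice and comes back as two independent copies.
round_tripped = json.loads(json.dumps(shared))
print(shared["home"] is shared["work"])
print(round_tripped["home"] is round_tripped["work"])

# A structure that refers to itself cannot be written at all.
node = {"name": "root"}
node["self"] = node
try:
    json.dumps(node)
    cycle_rejected = False
except ValueError as err:
    cycle_rejected = True
    print("cannot serialise:", err)
```

The values survive the round trip, but the sharing does not, and the
cyclic case fails outright.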

> Trying to do EVERYTHING with one format is hateful. It's ok to
> have more than one tool.

Yes, so raving about one of them as if it was quintessentially
bad is kinda pointless.

Regards,

From: Michael G Schwern
Date: 01:11 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

A. Pagaltzis wrote:
> * Michael G Schwern <schwern@xxxxx.xxx> [2007-09-28 03:45]:
>> $ cat `which yaml2json`
>> #!/usr/bin/perl -w
>>
>> use strict;
>> use YAML ();
>> use JSON ();
>>
>> my $json = JSON->new(pretty => 1, indent => 4);
>> print $json->objToJson(YAML::Load(join "", <>));
> 
> Let me know how that works out for YAML that contains references,
> data type annotations and the like.

Nobody actually uses that crap (and no, don't go find someone that does).


>> Trying to do EVERYTHING with one format is hateful. It's ok to
>> have more than one tool.
> 
> Yes, so raving about one of them as if it was quintessentially
> bad is kinda pointless.

Except here on hates-software pointlessness is the point!

From: Peter da Silva
Date: 04:38 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 27-Sep-2007, at 18:06, Michael G Schwern wrote:
> I've got it!  XML is the C of data languages!

Fuck NO.

C is actually useful all by itself.

It's a little funky, it's grown a little ad-hoc-ishly, it's not  
formally anything. And it does what it's supposed to.

Like runoff/nroff/troff/groff.

C++, I think you're thinking of C++.

From: Peter da Silva
Date: 04:08 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 27-Sep-2007, at 06:32, tgies wrote:
> But, for pity's sake, if you are going to define a new XML schema,
> please do not be a complete retard.

You mean

         <key>AreaCode</key>
         <string>281</string>
         <key>City</key>
         <string>Houston</string>
         <key>Company</key>
         <string></string>

Isn't the preferred way to do it?
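
The fragment looks like Apple's property-list XML; the message does not
name the format, so treat that as an assumption. A small sketch with
Python's plistlib shows that the pairing of each key with its value
rests purely on sibling order. The plist/dict envelope around the
fragment is added here so that the parser accepts it.

```python
import plistlib

# The fragment from the message, wrapped in the <plist>/<dict> envelope
# that plistlib expects (the envelope is an assumption, not in the post).
PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>AreaCode</key>
    <string>281</string>
    <key>City</key>
    <string>Houston</string>
    <key>Company</key>
    <string></string>
</dict>
</plist>
"""

# Each <key> is matched with whichever value element follows it.
record = plistlib.loads(PLIST)
print(record)
```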

From: tgies
Date: 12:56 on 27 Sep 2007
Subject: XML Schemas: Some Ground Rules

Attention jerks,
Okay, so you're going to use XML for every imaginable thing which you
can possibly contrive a way to use XML for, including uncompressed RGB
raster images, large relational databases, and the syntax for new
procedural imperative programming languages. Fine. Fine. I suppose I
can't stop you. After all, it is somehow considered new and it is
somehow considered Web 2.0 and there is an X in the name.

But, for pity's sake, if you are going to define a new XML schema,
please do not be a complete retard. XML has these things called
"attributes" for a reason. You do not need to define a separate tag
for every possible property an element in your XML document may have.
There is a reason that it is <font color="#FF0000" face="Comic Sans
MS">Hello</font> and not
<font>
 <color>#FF0000</color>
 <face>Comic Sans MS</face>
 <text>Hello</text>
</font>. Oh my God.

From: Peter Pentchev
Date: 13:06 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules


On Thu, Sep 27, 2007 at 06:56:06AM -0500, tgies wrote:
> Attention jerks,
> Okay, so you're going to use XML for every imaginable thing which you
> can possibly contrive a way to use XML for, including uncompressed RGB
> raster images, large relational databases, and the syntax for new
> procedural imperative programming languages. Fine. Fine. I suppose I
> can't stop you. After all, it is somehow considered new and it is
> somehow considered Web 2.0 and there is an X in the name.
>
> But, for pity's sake, if you are going to define a new XML schema,
> please do not be a complete retard. XML has these things called
> "attributes" for a reason. You do not need to define a separate tag
> for every possible property an element in your XML document may have.
> There is a reason that it is <font color="#FF0000" face="Comic Sans
> MS">Hello</font> and not
> <font>
>  <color>#FF0000</color>
>  <face>Comic Sans MS</face>
>  <text>Hello</text>
> </font>. Oh my God.

Erm... you do actually know that this - why do the SGML-derived mark-up
languages have both elements and attributes and what should be an
element and what should be an attribute - is an argument (or a religious
war, whichever way you look at it) that has been going for more than a
decade now, right?

G'luck,
Peter

-- 
Peter Pentchev	roam@xxxxxxx.xxx    roam@xxxxx.xx    roam@xxxxxxx.xxx
PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint	FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
Do you think anybody has ever had *precisely this thought* before?

From: tgies
Date: 13:27 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 9/27/07, Peter Pentchev <roam@xxxxxxx.xxx> wrote:

> Erm... you do actually know that this - why do the SGML-derived mark-up
> languages have both elements and attributes and what should be an
> element and what should be an attribute - is an argument (or a religious
> war, whichever way you look at it) that has been going for more than a
> decade now, right?

I would argue that it is pretty straightforward and that everyone
needs to shut up. I would further argue that I have never seen anyone
who actually cares to offer an opinion on the matter at all not offer
essentially the same basic set of rules. Inherent essential properties
are attributes. "Tangible" subsections of an element are subelements.
If there's only one of something, it likely needs to be an attribute.
If it's the unique ID of some element, it should not be a God damned
child element of that element.
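
The "only one of something" rule lines up with a hard constraint in XML
itself: an attribute name may appear at most once on an element, while
child elements may repeat freely. A minimal sketch using Python's
xml.etree.ElementTree; the helper name is_well_formed is made up for
illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The same attribute name twice on one element is rejected by the parser.
repeated_attribute = '<font face="Arial" face="Courier">Hello</font>'

# Repeating a child element is perfectly legal.
repeated_child = "<font><face>Arial</face><face>Courier</face>Hello</font>"

print(is_well_formed(repeated_attribute))
print(is_well_formed(repeated_child))
```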

Finally, I would argue that anyone who thinks that the <font> example
I gave in my original message could ever be a remotely good idea needs
to get off of computers.

Just because there are holy wars about something doesn't make either
side of the issue being warred over sacred. One side may very well be
stupid and uninformed and just pulling things out of their asses.

> G'luck,
> Peter
>
> --
> Peter Pentchev  roam@xxxxxxx.xxx    roam@xxxxx.xx    roam@xxxxxxx.xxx
> PGP key:        http://people.FreeBSD.org/~roam/roam.key.asc
> Key fingerprint FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
> Do you think anybody has ever had *precisely this thought* before?
>
>

From: demerphq
Date: 13:43 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 9/27/07, tgies <tgies@xxxxx.xxx> wrote:
> On 9/27/07, Peter Pentchev <roam@xxxxxxx.xxx> wrote:
>
> > Erm... you do actually know that this - why do the SGML-derived mark-up
> > languages have both elements and attributes and what should be an
> > element and what should be an attribute - is an argument (or a religious
> > war, whichever way you look at it) that has been going for more than a
> > decade now, right?
>
> I would argue that it is pretty straightforward and that everyone
> needs to shut up. I would further argue that I have never seen anyone
> who actually cares to offer an opinion on the matter at all not offer
> essentially the same basic set of rules. Inherent essential properties
> are attributes. "Tangible" subsections of an element are subelements.
> If there's only one of something, it likely needs to be an attribute.
> If it's the unique ID of some element, it should not be a God damned
> child element of that element.

Hear hear. From what I've seen the "attributes are evil use tags" crowd
usually justify their position by essentially stating that they are
crap schema designers who want to cover up for their lack of
foresightedness and planning and design skills by ensuring that they
will have every opportunity in the future to correct things. Which is
like a building designer putting all the wiring on the outside of the
walls because just maybe sometime in the future somebody will want to
put a socket somewhere unanticipated. Well, fucking do it right the
first time damnit. That's what "design" is for.

Cheers,
yves
ps: I have a headache, so there may be more vitriol in this post than
is strictly necessary.

From: Nicholas Clark
Date: 00:39 on 28 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On Thu, Sep 27, 2007 at 02:43:27PM +0200, demerphq wrote:

> Cheers,
> yves
> ps: I have a headache, so there may be more vitriol in this post than
> is strictly necessary.

On Thu, Sep 27, 2007 at 08:03:16PM +0200, demerphq wrote:

> Anyway, XML is hateful, full stop. We are just quibbling over the
> exact extent and nature of the hate. :-)

I think I preferred your attitude when you had the headache. When mailing
this list, please could you pretend that you had a headache still? :-)

[If necessary, think of Lotus Notes and pound your head into the keyboard a
few times. That should do the trick]

Nicholas Clark

From: Robert Rothenberg
Date: 15:52 on 28 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 28/09/07 00:39 Nicholas Clark wrote:

> [If necessary, think of Lotus Notes and pound your head into the keyboard a
> few times. That should do the trick]

Isn't that redundant?

From: A. Pagaltzis
Date: 02:05 on 28 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

* demerphq <demerphq@xxxxx.xxx> [2007-09-27 14:50]:
> Hear hear. From what I've seen the "attributes are evil use
> tags" crowd usually justify their position by essentially
> stating that they are crap schema designers who want to cover
> up for their lack of foresightedness and planning and design
> skills by ensuring that they will have every opportunity in the
> future to correct things. Which is like a building designer
> putting all the wiring on the outside of the walls because just
> maybe sometime in the future somebody will want to put a socket
> somewhere unanticipated. Well, fucking do it right the first
> time damnit. That's what "design" is for.

Yeah! And while you're at it, your methods should be "static
final" by default, you lazy slob! God, the clueless idiots who
leave all their methods overridable. No design skills whatsoever.

And let's not even talk about those "dynamic" language proponents.
Can't make up their minds about the types of their variables, the
nimrods.

* demerphq <demerphq@xxxxx.xxx> [2007-09-27 23:30]:
> On 9/27/07, Adam Atlas <adam@xxxxx.xx> wrote:
> > I think a decent rule of thumb is that attributes are for
> > parameters that are not displayed to the user directly, while
> > sub-elements are for structuring content that is displayed.
> > Attributes can affect the display, of course, but their raw
> > content should (usually) not be.
> 
> I think that the rule of thumb expressed earlier by tgies is a
> better one. When a piece of data is an inherent property of a
> tag, which cannot have children, then it should be an
> attribute. When it can have children of its own then it should
> be a tag.

The correct rule of thumb is narrower and more precise.

XML is MARKUP. That means the document text should be TEXT.
Everything that is *not* the document text should NOT BE IN the
document text.

That's it.

If you are designing an XML vocabulary that needs to wrap bits of
document in bits of metadata, then for the metadata, there is
another narrow, precise, simple rule: if it's an atomic value,
and you know there will only be one of them, then an attribute is
the right place. URIs belong in attributes, f.ex. Datetimes are a
good candidate. Stuff like that. Note that "you know there can
only be exactly one of that" doesn't mean "you can't anticipate
there ever being a need for more than one", it means "there
cannot be more than one *by definition*".

The `href` attribute in HTML links is a good example of
attributes used correctly, on both counts.
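
One way to see the "document text should be TEXT" rule in action is to
strip the markup and look at what is left. This is a sketch using
Python's xml.etree.ElementTree; the example URL and the helper name
document_text are made up for illustration.

```python
import xml.etree.ElementTree as ET

WITH_ATTRIBUTE = (
    '<p>Read <a href="http://example.org/">the article</a> first.</p>'
)
WITH_CHILD_ELEMENT = (
    "<p>Read <a><href>http://example.org/</href>the article</a> first.</p>"
)

def document_text(xml_text):
    """Concatenate every text node, i.e. the document with markup removed."""
    return "".join(ET.fromstring(xml_text).itertext())

# Metadata kept in an attribute stays out of the text ...
text_with_attribute = document_text(WITH_ATTRIBUTE)
# ... metadata kept in a child element leaks into it.
text_with_child = document_text(WITH_CHILD_ELEMENT)

print(text_with_attribute)
print(text_with_child)
```

With the URI in an attribute the stripped text reads as a sentence; with
the URI in a child element it ends up in the middle of the prose.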

And if you are designing an XML vocabulary that consists only of
metadata, then please take a cold shower right now and when you
come back, go look for a better tool for the job than XML.

> I don't buy the "it's all the same" argument. It's not. Tags are
> containers which can contain other containers, attributes are
> inherent properties of the tag to which they belong.

Yeah. `<a href>` (again) is a great refutation of the "attributes
are superfluous" fallacy.

> Anyway, XML is hateful, full stop. We are just quibbling over
> the exact extent and nature of the hate. :-)

It's the worst solution, except for all the others. Don't forget
that SGML and XML are rooted in the document/markup world, and
that's what they are for. They are not universal data structure
serialisation languages, they are not glorified CSV, they are not
generic grammars for each and every syntax with nestable blocks,
they are not a hammer in a world of nails. People who insist on
making every task a nail for which to use the XML hammer suck,
but so do most people who rant about XML.

Regards,

From: Peter da Silva
Date: 04:16 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

> If there's only one of something, it likely needs to be an attribute.

Good rule.

I like the way Konfabulator does it.

They go, like, "our XML parser doesn't care whether a unique
whatever-you-call-it is an attribute or a nested tag, do whatever floats
your boat, they're the same bloody thing as far as we're concerned".
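
A lookup in that spirit might be sketched as follows. This is not
Konfabulator's actual parser, just an illustration with Python's
xml.etree.ElementTree; get_property is a hypothetical helper, and the
font example is borrowed from the start of the thread.

```python
import xml.etree.ElementTree as ET

def get_property(element, name, default=None):
    """Look up `name` as an attribute first, then as a child element."""
    if name in element.attrib:
        return element.attrib[name]
    child = element.find(name)
    if child is not None:
        return child.text
    return default

as_attributes = ET.fromstring('<font color="#FF0000" face="Comic Sans MS"/>')
as_children = ET.fromstring(
    "<font><color>#FF0000</color><face>Comic Sans MS</face></font>"
)

# Both spellings answer the same questions the same way.
for font in (as_attributes, as_children):
    print(get_property(font, "color"), get_property(font, "face"))
```

The caller never learns which spelling the document author chose. The
cost is that the vocabulary has to decide what happens when both are
present; here the attribute wins.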

ObHate: Of course the way Konfabulator started throwing away all my  
widget settings and reinstalls their stupid demos every time I  
upgrade it after it became Yahoo Widgets is pretty hateful, but it's  
way less hateful than Dashboard, so I put up with it.

From: Daniel Pittman
Date: 13:22 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

tgies <tgies@xxxxx.xxx> writes:

> Okay, so you're going to use XML for every imaginable thing which you
> can possibly contrive a way to use XML for, including uncompressed RGB
> raster images, large relational databases, and the syntax for new
> procedural imperative programming languages. Fine. Fine. I suppose I
> can't stop you. After all, it is somehow considered new and it is
> somehow considered Web 2.0 and there is an X in the name.

...you were going so well and then, suddenly...

> But, for pity's sake, if you are going to define a new XML schema,
> please do not be a complete retard. XML has these things called
> "attributes" for a reason. 

Yes!  Attributes.  We need those in SGML ^W XML because they have much
more liberal rules than those pesky tags.  You don't need to specify
what they contain in any meaningful way in the DTD.

> You do not need to define a separate tag for every possible property
> an element in your XML document may have.

This is very important because it allows us to define an additional
vocabulary for our document that can hold all sorts of useful things,
such as:

Multi-value fields split by whatever character seemed like a good idea
at the time.  Unstructured data.  Random data imported from the
scripting language of the day.  Other languages.  Binary data, encoded
in some random format.
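
The multi-value case is easy to sketch. Below, a made-up faces
attribute packs three values behind a comma, next to the same data as
repeated child elements; the example uses Python's
xml.etree.ElementTree and none of the names come from a real vocabulary.

```python
import xml.etree.ElementTree as ET

packed = ET.fromstring('<font faces="Arial,Helvetica,sans-serif"/>')
structured = ET.fromstring(
    "<font><face>Arial</face><face>Helvetica</face>"
    "<face>sans-serif</face></font>"
)

# The parser hands back one opaque string; the comma convention lives
# only in the consuming code.
faces_from_attribute = packed.get("faces").split(",")

# With child elements the parser itself delivers the separate values.
faces_from_children = [face.text for face in structured.findall("face")]

print(faces_from_attribute)
print(faces_from_children)
```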

> There is a reason that it is 
> <font color="#FF0000" face="Comic Sans MS">Hello</font> 
> and not 
> <font>
>  <color>#FF0000</color>
>  <face>Comic Sans MS</face>
>  <text>Hello</text>
> </font>. 

Why, yes.  Yes, there is.  Imposing all the extra structure required for
the second form would, like, burden the DTD author and the poor computer
that has to read the content.

How do we know this is the case and not, say, that the attribute form
has real value?

Why, because there is absolutely no difference between the two forms.
The two document examples contain exactly the same information content.

The second, though, requires that you add more structure, the former
doesn't.

> Oh my God.

Something like that.

It would be really nice if this whole SGML thing was less crazy but,
hey, I gave up that particular dream quite a while back...

        Daniel

From: tgies
Date: 13:41 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 9/27/07, Daniel Pittman <daniel@xxxxxxxx.xxx> wrote:
> ...you were going so well and then, suddenly...

I think one of us is missing something here, because you appear
actually to be agreeing with me completely?

tgies

From: Daniel Pittman
Date: 14:32 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

"Tony Gies" <tony.gies@xxxxx.xxx> writes:
> On 9/27/07, Daniel Pittman <daniel@xxxxxxxx.xxx> wrote:
>> ...you were going so well and then, suddenly...
>
> I think one of us is missing something here, because you appear
> actually to be agreeing with me completely?

One of us must be.  Perhaps it was my sarcasm, perhaps I completely
misunderstood your point.  To help clear this up:

There is no difference in the information conveyed using a child tag or
an attribute of a tag.

The only difference between the two, in SGML, is that attributes have
fewer rules applied.  They can be, and often are, more informal.

Their existence is a mistake.  Their use is typically[1] a sign that the
developer of the ML failed to understand the essential sameness of the
two expressions /or/ decided that they wanted the lazy path.

In other words: SGML would be vastly better without attributes at all,
ever.  They are a waste of space, time and effort.

Though, in fairness[2], without them SGML would never be the raging hit
it is today.  If we asked the horde of monkeys that designed most of the
SGML in use around us to get by without their escape clause when a real
markup language got to be too much they wouldn't have used it.

Then I wouldn't get to look at another home page with a picture of a
kitten and an "under construction" animated GIF on it because we would
all be using gopher and whittling our IP packets out of wood.

        Daniel

Footnotes: 
[1]  Not always.  Sometimes they are nothing more than a style variant
     introduced for no good reason.

[2]  ...and back to the cynical and caustic side of things.

From: Sean Conner
Date: 18:33 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

It was thus said that the Great Daniel Pittman once stated:
> "Tony Gies" <tony.gies@xxxxx.xxx> writes:
> > On 9/27/07, Daniel Pittman <daniel@xxxxxxxx.xxx> wrote:
> >> ...you were going so well and then, suddenly...
> >
> > I think one of us is missing something here, because you appear
> > actually to be agreeing with me completely?
> 
> One of us must be.  Perhaps it was my sarcasm, perhaps I completely
> misunderstood your point.  To help clear this up:
> 
> There is no difference in the information conveyed using a child tag or
> an attribute of a tag.
> 
> The only difference between the two, in SGML, is that attributes have
> fewer rules applied.  They can be, and often are, more informal.
> 
> Their existence is a mistake.  Their use is typically[1] a sign that the
> developer of the ML failed to understand the essential sameness of the
> two expressions /or/ decided that they wanted the lazy path.

  So, you're saying you would rather have:

	<p><a>I grew up reading Uncle Scrooge<href>2005/01/21.1</href></a>,
	but by the time
	<a><href>http://en.wikipedia.org/wiki/Don_Rosa</href>Don Rosa</a>
	started drawing the comic, I had stopped reading comic books in
	general. And while I had heard of Don Rosa as an Uncle Scrooge
	artist, I had no idea <a>how good
	<href>http://www.2719hyperion.com/2007/05/don-rosas-son-of-sun.html</href>he
	was</a>, in both art and story.

  Ick.  Or does it go:

	<p>
	  <a>
	    <href>2005/01/21.1</href>
	    <text>I grew up reading Uncle Scrooge</text>
	  </a>
	  <text>, but by the time </text>
	  <a>
	    <href>http://en.wikipedia.org/wiki/Don_Rosa</href>
	    <text>Don Rosa</text>
	  </a>
	  <text>
		started drawing the comic, I had stopped reading comic books in
	        general. And while I had heard of Don Rosa as an Uncle Scrooge
	        artist, I had no idea 
	  </text>
	  <a>
	    <text>how good he was</text>
	    <href>http://www.2719hyperion.com/2007/05/don-rosas-son-of-sun.html</href>
	  </a>
	  <text>, in both art and story.</text>
	</p>

  Hmmm ... Microsoft Word anyone?

> Then I wouldn't get to look at another home page with a picture of a
> kitten and an "under construction" animated GIF on it because we would
> all be using gopher and whittling our IP packets out of wood.

  -spc (while still waiting for the hypertexty goodness of Xanadu?)

From: Adam Atlas
Date: 18:56 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 27 Sep 2007, at 13:33, Sean Conner wrote:

>   So, you're saying you would rather have:
>
> 	[...]
>
>   Ick.  Or does it go:
>
> 	[...]
>

Ick indeed!

I think a decent rule of thumb is that attributes are for parameters  
that are not displayed to the user directly, while sub-elements are  
for structuring content that is displayed. Attributes can affect the  
display, of course, but their raw content should (usually) not be.

Another rule of thumb is that if this is not applicable, because all  
or none of your data is to be displayed directly to the user, then  
you should probably not be using XML.

From: demerphq
Date: 19:03 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 9/27/07, Adam Atlas <adam@xxxxx.xx> wrote:
> On 27 Sep 2007, at 13:33, Sean Conner wrote:
>
> >   So, you're saying you would rather have:
> >
> >       [...]
> >
> >   Ick.  Or does it go:
> >
> >       [...]
> >
>
> Ick indeed!
>
> I think a decent rule of thumb is that attributes are for parameters
> that are not displayed to the user directly, while sub-elements are
> for structuring content that is displayed. Attributes can affect the
> display, of course, but their raw content should (usually) not be.

I think that the rule of thumb expressed earlier by tgies is a better
one. When a piece of data is an inherent property of a tag, which
cannot have children, then it should be an attribute. When it can have
children of its own then it should be a tag.

I don't buy the "it's all the same" argument. It's not. Tags are
containers which can contain other containers, attributes are inherent
properties of the tag to which they belong.
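
A minimal sketch of both positions, using Python's standard
xml.etree.ElementTree on the two <font> spellings from the start of the
thread. The helper name `properties` is made up for illustration. It
shows that the same name/value pairs can be recovered from either form,
and also that attributes arrive as a flat map of strings while children
are elements that could nest further:

```python
import xml.etree.ElementTree as ET

# The two spellings from the top of the thread.
attr_form = '<font color="#FF0000" face="Comic Sans MS">Hello</font>'
child_form = (
    "<font><color>#FF0000</color><face>Comic Sans MS</face>"
    "<text>Hello</text></font>"
)

def properties(xml_text):
    """Collect name/value pairs from attributes and from leaf children."""
    root = ET.fromstring(xml_text)
    props = dict(root.attrib)          # attributes: a flat map of strings
    for child in root:                 # children: elements that could nest
        props[child.tag] = child.text
    if root.text and root.text.strip():
        props["text"] = root.text      # bare character data of the element
    return props

print(properties(attr_form) == properties(child_form))
```

The comparison only holds because every child here is a leaf; once a
child needs children of its own, the attribute form has nowhere to put
them.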

Anyway, XML is hateful, full stop. We are just quibbling over the
exact extent and nature of the hate. :-)

Yves

From: Peter da Silva
Date: 04:33 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 27-Sep-2007, at 13:03, demerphq wrote:
> I don't buy the "it's all the same" argument. It's not. Tags are
> containers which can contain other containers, attributes are inherent
> properties of the tag to which they belong.

Both are part of the contents of the tag.

I will agree that you shouldn't shove complex structure inside an  
attribute, and whoever came up with the "style=" attribute in HTML  
needs to be taught a lesson using red hot cast iron angle brackets,  
but for things that can be treated lexically as attributes there's no  
point wasting space making them separate tags.

I realize that there are people who think it's important that object  
methods and object variables are different things and should have  
different syntax, or that it's actually a good thing that sockets are  
not in the file system or that System V shared memory segments are  
not in the file system, or that it's a good thing that sockets and  
file descriptors in Windows are distinct, or that it really matters  
whether my email address looks like "Peter da Silva  
<peter@xxxxxxx.xxx>" instead of "peter@xxxxxxx.xxx (Peter da Silva)".  
But I am not one of them. I don't even think distinguishing objects  
and classes was a bright idea.

From: Daniel Pittman
Date: 01:29 on 28 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

Sean Conner <spc@xxxxxx.xxx> writes:
> It was thus said that the Great Daniel Pittman once stated:
>> "Tony Gies" <tony.gies@xxxxx.xxx> writes:
>> > On 9/27/07, Daniel Pittman <daniel@xxxxxxxx.xxx> wrote:
>> >> ...you were going so well and then, suddenly...
>> >
>> > I think one of us is missing something here, because you appear
>> > actually to be agreeing with me completely?
>> 
>> One of us must be.  Perhaps it was my sarcasm, perhaps I completely
>> misunderstood your point.  To help clear this up:
>> 
>> There is no difference in the information conveyed using a child tag or
>> an attribute of a tag.
>> 
>> The only difference between the two, in SGML, is that attributes have
>> fewer rules applied.  They can be, and often are, more informal.
>> 
>> Their existence is a mistake.  Their use is typically[1] a sign that the
>> developer of the ML failed to understand the essential sameness of the
>> two expressions /or/ decided that they wanted the lazy path.
>
>   So, you're saying you would rather have:
>
> <p><a>I grew up reading Uncle Scrooge<href>2005/01/21.1</href></a>,

I am saying that there is no significant difference between the level of
information conveyed by that (or by the other format with the extra text
tag) and by the form that uses attributes.

I am also saying that the attributes have more relaxed rules applied
than the tags, something that is essentially a mistake in my opinion,
because it creates a "get out of design free" card.

Finally, of note: we were talking about the abomination of XML that was
designed by taking SGML, pulling anything designed to make it human or
author friendly out, then claiming that this was all done for the best
because we can much better afford to spend human brain "cycles" than a
few CPU cycles dealing with the added complexity.

        Daniel

Why, yes, XML is a complete waste of effort that has added zero value on
top of the traditional SGML.

From: A. Pagaltzis
Date: 02:20 on 28 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

* Daniel Pittman <daniel@xxxxxxxx.xxx> [2007-09-28 02:40]:
> we were talking about the abomination of XML that was designed
> by taking SGML, pulling anything designed to make it human or
> author friendly out, then claiming that this was all done for
> the best because we can much better afford to spend human brain
> "cycles" than a few CPU cycles dealing with the added
> complexity.

Err, was there ever a complete implementation of SGML? No? For a
reason.

Not that I would have minded *some* of the minimisation shortcuts
of SGML, mind.

Though, being able to verify well-formedness of a document without
having a schema is a huge win, because it drastically lowers the
barrier to getting ad-hoc work done. That alone is in conflict
with many syntactic shortcuts.

Also if you insist on using all shortcuts available in full SGML
you can easily produce a cryptic document that a human has
trouble reading too.

I really wish they had kept at least the `</>` shortcut, though.
Plus a few others that don't conflict with schemaless well-
formedness checking, like whatever it was that closes the last
open tag and opens a new one with the same name, which is nice
for writing sequences of list items or paragraphs or such.

Regards,

From: tgies
Date: 19:57 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 9/27/07, Daniel Pittman <daniel@xxxxxxxx.xxx> wrote:
> One of us must be.  Perhaps it was my sarcasm, perhaps I completely
> misunderstood your point.  To help clear this up:

Okay, see, I thought you were being sarcastic the other way around,
because I can't imagine why anyone would honestly think that.

tgies

From: Michael G Schwern
Date: 23:52 on 27 Sep 2007
Subject: Ant and oh god don't make me write XML and more YAML (was Re: XML Schemas: Some Ground Rules)

Daniel Pittman wrote:
> One of us must be.  Perhaps it was my sarcasm, perhaps I completely
> misunderstood your point.  To help clear this up:
> 
> There is no difference in the information conveyed using a child tag or
> an attribute of a tag.
> 
> The only difference between the two, in SGML, is that attributes have
> fewer rules applied.  They can be, and often are, more informal.

And that the attribute form is a hell of a lot more compact and easier to
eyeball.  Sometimes humans have to read and write this crap, ya know?

As insane as it is that anyone would pick XML as a human data format.  I'm
looking at YOU Ant!

Here's the "simple" Ant Buildfile from their documentation.
http://ant.apache.org/manual/using.html#example

<project name="MyProject" default="dist" basedir=".">
    <description>
        simple example build file
    </description>
  <!-- set global properties for this build -->
  <property name="src" location="src"/>
  <property name="build" location="build"/>
  <property name="dist"  location="dist"/>

  <target name="init">
    <!-- Create the time stamp -->
    <tstamp/>
    <!-- Create the build directory structure used by compile -->
    <mkdir dir="${build}"/>
  </target>

  <target name="compile" depends="init"
        description="compile the source " >
    <!-- Compile the java code from ${src} into ${build} -->
    <javac srcdir="${src}" destdir="${build}"/>
  </target>

  <target name="dist" depends="compile"
        description="generate the distribution" >
    <!-- Create the distribution directory -->
    <mkdir dir="${dist}/lib"/>

    <!-- Put everything in ${build} into the MyProject-${DSTAMP}.jar file -->
    <jar jarfile="${dist}/lib/MyProject-${DSTAMP}.jar" basedir="${build}"/>
  </target>

  <target name="clean"
        description="clean up" >
    <!-- Delete the ${build} and ${dist} directory trees -->
    <delete dir="${build}"/>
    <delete dir="${dist}"/>
  </target>
</project>

Oh GOD!  I'm supposed to read that?!  And write it?!  Willingly?!  It makes
make look positively beautiful.

Here's the same thing in YAML.

name:           MyProject
default:        dist
basedir:        .
description:    simple example build file

# set global properties for this build
properties:
    src:
        location:   src/
    build:
        location:   build/
    dist:
        location:   dist/

targets:
    init:
        actions:
            - tstamp
            - mkdir: { dir: '${build}' }

    compile:
        depends:    init
        description: compile the source
        actions:
            # Compile the java code from ${src} into ${build}
            - javac: { srcdir: "${src}", destdir: "${build}" }

    dist:
        depends:        compile
        description:    generate the distribution
        actions:
            # Create the distribution directory
            - mkdir:  { dir: '${dist}/lib/' }

            # Put everything in ${build} into MyProject-${DSTAMP}.jar
            - jar:
                jarfile: '${dist}/lib/MyProject-${DSTAMP}.jar'
                basedir: '${build}'

    clean:
        description:    clean up
        actions:
            # Delete the ${build} and ${dist} directory trees
            delete:
                - { dir: '${build}' }
                - { dir: '${dist}'  }

Note the almost complete lack of scaffolding code (text that's conveying only
syntactical structure).  Just not having to worry about balancing tags alone
makes my brain leap in joy!  Different things look different, it's not a
homogenous mud of angle brackets.  Comments stand out.  Lists stand out.
Key/value pairs stand out and can be sensibly lined up.  Why?  Because it
doesn't try to pretend the entire universe can be rammed into just a tree
structure.  The real world uses lists and pairs.

And that's just a rote translation.  If it started out with the
scalar/list/hash mindset of YAML instead of <phrase><word
type=article>the</word><word type=adjective>nested</word><word
type=noun>tree</word><word type=foul>crap</word><punctuation>,</punctuation> I
could probably come out with something even better.
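
The rote translation above can also be done mechanically. This is only a
sketch, assuming an abbreviated copy of the buildfile and a made-up
`to_targets` helper; the dict layout (target name mapping to "depends"
and a list of "actions") mirrors the YAML above and is not anything Ant
itself defines:

```python
import xml.etree.ElementTree as ET

# Abbreviated from the Ant buildfile quoted above.
BUILD_XML = """
<project name="MyProject" default="dist" basedir=".">
  <target name="init">
    <tstamp/>
    <mkdir dir="${build}"/>
  </target>
  <target name="compile" depends="init">
    <javac srcdir="${src}" destdir="${build}"/>
  </target>
</project>
"""

def to_targets(xml_text):
    """Map target name -> {'actions': [...], 'depends': ...} (assumed layout)."""
    project = ET.fromstring(xml_text)
    targets = {}
    for target in project.findall("target"):
        # Each task becomes a one-key hash: task name -> its attributes.
        entry = {"actions": [{task.tag: dict(task.attrib)} for task in target]}
        if "depends" in target.attrib:
            entry["depends"] = target.attrib["depends"]
        targets[target.attrib["name"]] = entry
    return targets

targets = to_targets(BUILD_XML)
print(targets["compile"])
```

What comes out is plain scalars, lists and hashes, which is the shape
the YAML version writes down directly.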

From: Daniel Pittman
Date: 01:39 on 28 Sep 2007
Subject: Re: Ant and oh god don't make me write XML and more YAML

Michael G Schwern <schwern@xxxxx.xxx> writes:
> Daniel Pittman wrote:
>
>> One of us must be.  Perhaps it was my sarcasm, perhaps I completely
>> misunderstood your point.  To help clear this up:
>> 
>> There is no difference in the information conveyed using a child tag or
>> an attribute of a tag.
>> 
>> The only difference between the two, in SGML, is that attributes have
>> fewer rules applied.  They can be, and often are, more informal.
>
> And that the attribute form is a hell of a lot more compact and easier
> to eyeball.  Sometimes humans have to read and write this crap, ya
> know?

Oh, believe me, I *KNOW* that.  Having to deal with more and more XML
crap at the protocol level has really driven that home.

Why, yes, EPP, I /am/ looking at you.  Felching miserable half-caste
screwed up abortion of a  protocol.  

I mean, seriously.  You want XML for data exchange, fine.  Use it.  It
is a terrible choice but, hey, you wrote these rules and it is better
than making up a standard.  (Well, probably.)

But, hey, that wasn't enough was it?  OH, no.  We need to be special and
we want to send multiple XML messages over a single TCP connection.

Well, then, what should we do?  Why, add a 32bit binary number before
each XML message to give the length, adding binary framing to an
otherwise text only protocol.

Yes!  Brilliant!  Because XML isn't, you know, self-framing or
anything.  Especially good because binary data is efficient so we don't
waste any bytes sending an ASCII or line oriented size field either.

That saves bandwidth for when we are transmitting the human formatted,
filled with insignificant whitespace XML data around!

Anyway, if you rely on the self framing aspect of it you could end up in
all sorts of trouble.  Why, when TCP drops data bytes at random (as it
is so prone to doing) you could end up missing the end of a frame and
getting out of sync!  Disaster!
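
For reference, the kind of framing being complained about looks roughly
like this. It is a sketch, not EPP's actual wire code: the function
names are invented, and the choice to have the length field count its
own four bytes is an assumption that should be checked against the EPP
TCP mapping before relying on it.

```python
import struct

HEADER = struct.Struct("!I")  # 32-bit unsigned length, network byte order

def frame(payload: bytes) -> bytes:
    """Prefix payload with the total frame length (header included, by assumption)."""
    return HEADER.pack(len(payload) + HEADER.size) + payload

def unframe(buffer: bytes):
    """Split a byte buffer into complete payloads plus any leftover bytes."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (total,) = HEADER.unpack_from(buffer, offset)
        if total < HEADER.size:
            raise ValueError("length field smaller than its own header")
        if len(buffer) - offset < total:
            break  # incomplete frame; wait for more data
        messages.append(buffer[offset + HEADER.size:offset + total])
        offset += total
    return messages, buffer[offset:]

stream = frame(b"<hello/>") + frame(b"<epp>\n  <login/>\n</epp>")
messages, rest = unframe(stream + b"\x00\x00")
print(messages, rest)
```

Note that the receiver never looks inside the XML to find a message
boundary; one wrong length and every later frame is misread.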

> As insane as it is that anyone would pick XML as a human data format.
> I'm looking at YOU Ant!

Y hullo thar!  Ant!  Java!  If you wanted to ask how XML could be made
worse the answer lies in there somewhere.  Oh, yes.

I have avoided working with Ant so far because, frankly, if I am ever
forced to use it I probably /will/ buy a plane ticket, fly to the US and
stab James Gosling in the head with a spoon while screaming 

    "You worked with Lisp and Scheme!  You know what a good mechanism
     for expressing syntax is!  How! Could! You! Do! This! To! Us!"

...and, you know, that whole frothing at the mouth thing is a fashion
disaster.  People would talk.

[...]

> Oh GOD!  I'm supposed to read that?!  And write it?!  Willingly?!  It
> makes make look positively beautiful.

Ah, but at least it is all case sensitive so the CPU doesn't have to
work so hard to read it.  Much better than those nasty SGML
applications.

        Daniel

From: Phil Pennock
Date: 08:57 on 28 Sep 2007
Subject: Re: Ant and oh god don't make me write XML and more YAML

On 2007-09-28 at 10:39 +1000, Daniel Pittman wrote:
> Why, yes, EPP, I /am/ looking at you.  Felching miserable half-caste
> screwed up abortion of a  protocol.  

> Well, then, what should we do?  Why, add a 32bit binary number before
> each XML message to give the length, adding binary framing to an
> otherwise text only protocol.

Bah, that's nothing.  Read the protocol spec for BEEP:
 RFC 3080 The Blocks Extensible Exchange Protocol Core
 RFC 3081 Mapping the BEEP Core onto TCP

BEEP is the foundation for things like the inter-registrar protocols for
DNS domains.  Then there are RFCs to nest SOAP and XML-RPC in BEEP.  Oh,
and NETCONF for managing network devices (using XML)?  They couldn't
agree on a standard transport, so you get to pick between the equally
standards-blessed SSH, SOAP and BEEP.

So, multiplexed command-response with multiple channels within a
connection, independent messages within channels, with a header line
containing those numbers and a length and a trailer line of END.  Within
that, you have MIME headers followed by the content, which is
technically arbitrary 8-bit data, except that all of the BEEP
maintenance protocol stuff puts XML in there.

The channels are independent, except when they're not (activating TLS
turns it on for all channels, which fit within TLS, thus immediately
breaking the IO channel abstraction).

Authentication is provided via SASL.  Again, encoded inside XML, with
things like CRAM-MD5 challenges encoded inside <![CDATA[...]]>, _and_
first wrapped with an XML 'blob' element.  To quote a reply from the
RFC:

 S: RPY 0 1 . 221 185
 S: Content-Type: application/beep+xml
 S:
 S: <profile uri='http://iana.org/beep/SASL/CRAM-MD5'>
 S: <![CDATA[<blob>PDE4OTYuNjk3MTcwOTUyQHBvc3RvZmZpY2UucmVzdG9uLm1jaS5uZXQ+</blob>]]>
 S: </profile>
 S: END

And yes, it's BEEP which mandates (via a SASL profile) that the binary
data of the challenge must be encoded as a base64 string.
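
To see how many layers that is, here is a small sketch that peels the
payload of the quoted reply with Python's standard library. Only the
XML body is handled; the BEEP header line, the MIME header and the END
trailer are assumed to have been stripped already, and the variable
names are mine:

```python
import base64
import xml.etree.ElementTree as ET

reply_body = (
    "<profile uri='http://iana.org/beep/SASL/CRAM-MD5'>\n"
    "<![CDATA[<blob>PDE4OTYuNjk3MTcwOTUyQHBvc3RvZmZpY2UucmVzdG9uLm1jaS5uZXQ+</blob>]]>\n"
    "</profile>"
)

# Layer 1: the profile element. The CDATA section is plain text to the parser.
profile = ET.fromstring(reply_body)

# Layer 2: that text is itself a small XML document, so parse again.
blob = ET.fromstring(profile.text.strip())

# Layer 3: the blob content is base64; decode it to get the SASL challenge.
challenge = base64.b64decode(blob.text)
print(challenge.decode("ascii"))
```

Two XML parses and a base64 decode to recover one short challenge
string.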

I've never had to write to BEEP.  I'm grateful.  The fact that this
protocol exists bothers me.

-Phil

From: Aaron Crane
Date: 09:43 on 28 Sep 2007
Subject: Re: Ant and oh god don't make me write XML and more YAML

Daniel Pittman writes:
> Because XML isn't, you know, self-framing or anything.

It's not entirely self-framing, no.  Here is a well-formed XML
instance:

  <a/>
  <?a?>

Here is another:

  <?b?>
  <b/>

If you concatenate them, you can't tell which PI goes with which
instance.  (Though the problem goes away if you require an XML
declaration on every instance; an XML declaration if present must
appear before any whitespace or comments or PIs.)
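
A quick check with Python's xml.etree.ElementTree (an expat-based
parser; other parsers may word their errors differently) illustrates
the practical side of this. The helper name `parses` is hypothetical:

```python
import xml.etree.ElementTree as ET

first = "<a/>\n<?a?>\n"
second = "<?b?>\n<b/>\n"

def parses(text):
    """Return True if text is accepted as a single well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Each instance is fine alone; glued together they are not one document.
print(parses(first), parses(second), parses(first + second))
```

The parser can only say that the concatenation is not a single
document. It cannot say where the first one ended, and a hand-written
splitter has no rule for deciding which instance owns the two PIs in
the middle.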

I seem to recall reading that this wasn't intentional, and that when
the WG realised that their published spec had this property, they were
somewhat surprised.  I think that pushes XML firmly into the "hateful"
category, even for people who didn't think it was there already.

From: A. Pagaltzis
Date: 17:15 on 28 Sep 2007
Subject: Re: Ant and oh god don't make me write XML and more YAML

* Aaron Crane <hateful@xxxxxxxxxx.xx.xx> [2007-09-28 10:50]:
> If you concatenate them, you can't tell which PI goes with
> which instance.

Augh. Never mind attributes vs elements: PIs, now there's a wart
that gives vocabulary designers an escape hatch from proper
design.

As for framing, what I've seen in some places is using a form
feed character (which is illegal in XML) to separate concatenated
documents. That's not perfect, but at least it's much better than
using offsets, as you can trivially resynch with a stream if you
miss a few packets: just wait long enough and the next FF will come
along.
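
A rough sketch of that scheme, assuming the whole stream is already in
memory as bytes (a real reader would consume a socket incrementally)
and using an invented `split_documents` helper. A damaged chunk is
simply dropped and parsing picks up again at the next separator:

```python
import xml.etree.ElementTree as ET

FF = b"\x0c"  # form feed: not a legal character in XML 1.0 content

def split_documents(stream: bytes):
    """Yield parsed roots from FF-separated documents, skipping damaged chunks."""
    for chunk in stream.split(FF):
        if not chunk.strip():
            continue
        try:
            yield ET.fromstring(chunk)
        except ET.ParseError:
            continue  # resync: drop this chunk, carry on at the next FF

# The middle document is truncated, as if data had gone missing.
stream = b"<a/>\n<?a?>\n" + FF + b"<?b?>\n<b" + FF + b"<c/>"
roots = [root.tag for root in split_documents(stream)]
print(roots)
```

With a length prefix, the same truncation would leave the reader
counting bytes from the wrong place for the rest of the connection.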

Regards,

From: Andrew McRae
Date: 08:05 on 28 Sep 2007
Subject: Re: Ant and oh god don't make me write XML and more YAML (was Re: XML Schemas: Some Ground Rules)

On 27 Sep 2007, at 23:52, Michael G Schwern wrote:
> As insane as it is that anyone would pick XML as a human data  
> format.  I'm
> looking at YOU Ant!

"human data format"?

Ant uses XML as a *programming language syntax*. That is completely  
insane.

Happily, the original author of Ant seems to have figured out pretty  
quickly that that was a really dumb idea. Sadly, its other million or  
whatever users don't appear to have noticed yet.

From: Peter da Silva
Date: 04:23 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 27-Sep-2007, at 08:32, Daniel Pittman wrote:
> In other words: SGML would be vastly better without attributes at all,
> ever.  They are a waste of space, time and effort.

They're identical to nested tags, and they're more compact, HOW THE  
HELL ARE THEY A WASTE OF SPACE?

Of course things like <b/bold text/ are also more compact than  
<b>bold text</b>, therefore they got sucked out of XML.

Hateful buggers.

I'm surprised XML retained <tag/> or <tag>foo</> instead of forcing  
<tag></tag> and <tag>foo</tag>.

From: Adam Atlas
Date: 04:38 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules


On 28 Sep 2007, at 23:23, Peter da Silva wrote:
> I'm surprised XML retained [...] <tag>foo</>

That exists?

From: Peter da Silva
Date: 04:47 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 28-Sep-2007, at 22:38, Adam Atlas wrote:
> On 28 Sep 2007, at 23:23, Peter da Silva wrote:
>> I'm surprised XML retained [...] <tag>foo</>

> That exists?

I thought it did. Have I been misled by assuming that the X in XHTML  
means it's based on XML, or something like that?

From: A. Pagaltzis
Date: 05:23 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

* Peter da Silva <peter@xxxxxxx.xxx> [2007-09-29 05:55]:
> On 28-Sep-2007, at 22:38, Adam Atlas wrote:
>> On 28 Sep 2007, at 23:23, Peter da Silva wrote:
>>> I'm surprised XML retained [...] <tag>foo</>
>
>> That exists?
>
> I thought it did.

No, it does not. The only shorthand in XML is the `<tag/>`
self-closing syntax.

> Have I been misled by assuming that the X in XHTML means it's
> based on XML, or something like that?

XHTML is indeed based on XML and therefore doesn't have `</>` any
more than XML in general does.

Regards,

From: Peter da Silva
Date: 10:35 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 28-Sep-2007, at 23:23, A. Pagaltzis wrote:
> XHTML is indeed based on XML and therefore doesn't have `</>` any
> more than XML in general does.

Then I guess that mustn't actually be enforced. What a surprise. :)

From: Jonathan Stowe
Date: 11:02 on 29 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On Sat, 2007-09-29 at 04:35 -0500, Peter da Silva wrote:
> On 28-Sep-2007, at 23:23, A. Pagaltzis wrote:
> > XHTML is indeed based on XML and therefore doesn't have `</>` any
> > more than XML in general does.
> 
> Then I guess that mustn't actually be enforced. What a surprise. :)
> 
> 

Well some things enforce it

        bash-3.1$ xmllint -
        <foo>bar</>
        -:1: parser error : Opening and ending tag mismatch: foo line 1 and unparseable
        <foo>bar</>
                   ^

I can't speak for some of the more unspeakable parsers out there
though. 
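
The same check can be made without xmllint. A sketch using Python's
standard xml.etree.ElementTree, which sits on expat; the `well_formed`
helper is made up, and the exact error text differs from xmllint's:

```python
import xml.etree.ElementTree as ET

def well_formed(text):
    """Return True if the parser accepts text as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# SGML-style empty end tag, the full form, and the XML empty-element tag.
print(well_formed("<foo>bar</>"),
      well_formed("<foo>bar</foo>"),
      well_formed("<foo/>"))
```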

Let's get back to *hating software*.

/J\

From: A. Pagaltzis
Date: 07:25 on 30 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

* Peter da Silva <peter@xxxxxxx.xxx> [2007-09-29 11:40]:
> On 28-Sep-2007, at 23:23, A. Pagaltzis wrote:
>> XHTML is indeed based on XML and therefore doesn't have `</>`
>> any more than XML in general does.
>
> Then I guess that mustn't actually be enforced. What a
> surprise. :)

That's if you send the document as text/html. In that case, the
browser uses the tagsoup parser, regardless of what the document
says about itself. Usually people use the Appendix C rules from
the XHTML spec to make XHTML work with tagsoup parsing -- which
only works because it doesn't follow SGML rules. If browsers used
SGML parsers, then sending XHTML as text/html would fail.

Basically everyone who puts a "Valid XHTML!" button on their site
should really be using an "Invalid HTML 4.01!" button instead.

However, if you send the page as application/xhtml+xml, then the
browser *will* enforce XML well-formedness. Except that IE does
not understand that MIME type (or XHTML, for that matter) and
will throw up a download dialog instead of rendering the page.
So no one except fringe lunatics like me does that.

It's a giant bucket of suck, mostly because both MSFT and the W3C
have been asleep at the wheel for almost a decade. Hate.

Regards,

From: David Landgren
Date: 08:36 on 17 Nov 2007
Subject: Re: XML Schemas: Some Ground Rules

Peter da Silva wrote:
> On 27-Sep-2007, at 08:32, Daniel Pittman wrote:
>> In other words: SGML would be vastly better without attributes at all,
>> ever.  They are a waste of space, time and effort.
> 
> They're identical to nested tags, and they're more compact, HOW THE HELL 
> ARE THEY A WASTE OF SPACE?
> 
> Of course things like <b/bold text/ are also more compact than <b>bold 
> text</b>, therefore they got sucked out of XML.
> 
> Hateful buggers.

I was looking for some old hate to warm my bones, as I wait for a 
fucking train that's not coming to take me to the French Perl Workshop. 
This thread has been a great comfort.

> I'm surprised XML retained <tag/> or <tag>foo</> instead of forcing 
> <tag></tag> and <tag>foo</tag>.

<tag /> did not exist in SGML.

Instead, XML *invented* <tag /> to work around the abortions in HTML 
that are <hr> and <br>, given that XML declared the omission of closing 
tags to be illegal. Since everyone was trying to coerce HTML into XML, 
these caused big problems in validating the document in the absence of a 
schema.
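
A small illustration of the problem, sketched with Python's standard
xml.etree.ElementTree (variable names are mine). The empty-element form
and the explicit open/close pair give the same tree, while the
HTML-style bare <hr> cannot be parsed without knowing from a schema
that it is empty:

```python
import xml.etree.ElementTree as ET

short = ET.fromstring("<p>above<hr/>below</p>")
long = ET.fromstring("<p>above<hr></hr>below</p>")

# Both spellings serialize identically, so the parser sees the same tree.
same_tree = ET.tostring(short) == ET.tostring(long)

try:
    ET.fromstring("<p>above<hr>below</p>")  # HTML habit: no closing tag
    html_style_ok = True
except ET.ParseError:
    html_style_ok = False

print(same_tree, html_style_ok)
```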

This syntax was then back-ported to SGML in 1998 or so, in order to 
allow the XML crowd to say "see? XML is a subset of SGML". There were a 
couple of other tweaks that went back this way too, that I have happily 
managed to forget.

David

From: Phil Pennock
Date: 20:56 on 27 Sep 2007
Subject: Re: XML Schemas: Some Ground Rules

On 2007-09-27 at 06:56 -0500, tgies wrote:
> <font>
>  <color>#FF0000</color>
>  <face>Comic Sans MS</face>
>  <text>Hello</text>
> </font>. Oh my God.

<paragraph><sentence><word pos="pronoun">You</word><word pos="verb"
subpos="auxiliary">should</word><word pos="verb">be</word><word
pos="adjective">grateful</word><word pos="conjunctive">that</word>
<word pos="pronoun">this</word><word pos="verb">is</word><word
pos="adverb">not</word><word pos="noun" subpos="proper">MathML</word>
<punctuation>.</punctuation></sentence></paragraph>

<hr/>
&emdash;Phil

Generated at 10:28 on 16 Apr 2008 by mariachi