
9/27/2010

HTML5 Custom Data Attributes (data-*)


Thanks to HTML5, we now have the ability to embed custom data attributes on all HTML elements. These new custom data attributes consist of two parts:

Attribute Name
The data attribute name must be at least one character long and must be prefixed with 'data-'. It should not contain any uppercase letters.
Attribute Value
The attribute value can be any string.

Using data- attributes with JavaScript

// 'Getting' data-attributes using getAttribute
var plant = document.getElementById('strawberry-plant');
var fruitCount = plant.getAttribute('data-fruit'); // fruitCount = '12'

// 'Setting' data-attributes using setAttribute
plant.setAttribute('data-fruit','7'); // Pesky birds


This method will work in all modern browsers, but it is not how data- attributes are intended to be used. The second (new and improved) way to achieve the same thing is by accessing an element’s dataset property. This dataset property, part of the new HTML5 JavaScript APIs, returns a DOMStringMap object of all the selected element's data- attributes. When using this approach, rather than using the full attribute name, you can ditch the data- prefix and refer to the custom data directly using the name you have assigned to it. In data attribute names that contain hyphens, each hyphen followed by a lowercase letter is removed and that letter is uppercased, giving a camelCase name (data-plant-height becomes plantHeight).


// 'Getting' data-attributes using dataset
var plant = document.getElementById('sunflower');
var leaves = plant.dataset.leaves; // leaves = '47' (dataset values are always strings)

// Reading a hyphenated name: 'data-plant-height' is exposed as 'plantHeight'
var tallness = plant.dataset.plantHeight;

// 'Setting' data-attributes using dataset
plant.dataset.plantHeight = '3.6m'; // Cracking fertiliser
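
The name mapping between a data- attribute and its dataset key can be sketched in plain JavaScript. This is only an illustration of the rule, not the browser's implementation, and the helper name dataAttrToDatasetKey is made up for this example.

```javascript
// Hypothetical helper: converts a data-* attribute name to the key
// that the same value is exposed under on element.dataset.
function dataAttrToDatasetKey(attrName) {
  if (attrName.indexOf('data-') !== 0) {
    throw new Error('Not a data-* attribute: ' + attrName);
  }
  return attrName
    .slice('data-'.length)
    // Drop each hyphen that is followed by a lowercase letter and uppercase that letter
    .replace(/-([a-z])/g, function (match, letter) {
      return letter.toUpperCase();
    });
}

var heightKey = dataAttrToDatasetKey('data-plant-height'); // 'plantHeight'
var fruitKey = dataAttrToDatasetKey('data-fruit');         // 'fruit'
```

Names without hyphens after the prefix pass through unchanged, which is why plant.dataset.leaves works directly.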

A gentle introduction to CouchDB for relational practitioners




CouchDB is a document-oriented database written in Erlang that addresses a particular “sweet spot” in data storage and retrieval needs. This blog post is an introduction to CouchDB for those of us who have a relational database background.

A CouchDB database doesn’t have tables. It has a collection of documents, stored in a B+Tree. A document is a collection of attributes and values. Values can be atomic, or complex nested structures such as arrays and sub-documents. When you add a document to a database, CouchDB stores it in the B+Tree, indexed by two attributes with special meaning: _id and _rev.

CouchDB lets you store related data together even if it isn’t all the same type of data; you can store documents representing blog posts, users, and comments — all in the same database. This is not as chaotic as it sounds. To get your data back out of CouchDB in sensible ways, you define views over the database. A view stores a subset of the database’s documents. You can think of them as materialized partial indexes. You can create a view of blog posts, and a view of comments, and so on. Each view is another B+Tree. It stays up-to-date with the changes you make to the database.

You can structure your documents any way you want. There is no fixed schema. If you decide after a while that you want to add tags to your blog posts, you can simply write new posts with a collection of tags and save them into the database. Old posts won’t have tags, but that’s OK; if your application code can read the old format and write the new format, you have an application that doesn’t need a fixed schema.

Updates are never done in-place. Everything is copy-on-write. New revisions are saved into the database as new documents, obsoleting old ones, and CouchDB increments the _rev property each time. To update a document, you fetch it, change it, and send it back, specifying the _id and the most recent _rev. If someone else changed the document in the meantime, your _rev is stale, and your update fails. You must re-fetch and re-save; you can’t lock a document.
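
The fetch, change, and re-save cycle can be sketched with a toy in-memory store. This is not the CouchDB HTTP API: createStore, get, and put are hypothetical names, and revisions are simplified to a counter string, whereas real _rev values carry more than a counter.

```javascript
// Toy store that imitates CouchDB's revision check (an assumption-laden sketch).
function createStore() {
  var docs = {}; // _id -> latest saved revision of each document
  return {
    // Returns a copy, so callers never edit the stored revision in place
    get: function (id) {
      return docs[id] ? JSON.parse(JSON.stringify(docs[id])) : null;
    },
    // Accepts the write only if doc._rev matches the latest stored revision
    put: function (doc) {
      var current = docs[doc._id];
      var currentRev = current ? current._rev : undefined;
      if (doc._rev !== currentRev) {
        return { ok: false, error: 'conflict' };
      }
      var saved = JSON.parse(JSON.stringify(doc));
      saved._rev = String((current ? parseInt(current._rev, 10) : 0) + 1);
      docs[doc._id] = saved;
      return { ok: true, rev: saved._rev };
    }
  };
}

var store = createStore();
store.put({ _id: 'helloworld', title: 'Hello' });

var mine = store.get('helloworld');
var theirs = store.get('helloworld');

theirs.title = 'Hi';
var firstWrite = store.put(theirs);  // accepted: the revision was current

mine.title = 'Hey';
var staleWrite = store.put(mine);    // rejected: the revision is now stale

mine = store.get('helloworld');      // re-fetch to pick up the latest _rev
mine.title = 'Hey';
var retryWrite = store.put(mine);    // accepted
```

No lock is ever taken; the loser of the race simply finds out at save time and retries.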

CouchDB runs on HTTP and JSON. All of its operations, such as store and retrieve, are standard HTTP requests. The documents themselves are represented in JSON. You can talk directly to CouchDB with curl, Ajax, and anything else that can speak HTTP. There is no “protocol” other than this. CouchDB isn’t just Web-friendly, it is actually made of the same technologies that the Web is made of. You query CouchDB by specifying the database, document ID, view name, and so forth directly in the URL. For example, to fetch a blog post document from the “blog” database, you might issue a GET /blog/helloworld. Queries against views and other objects have simple clean URLs, too.

CouchDB uses special documents, called “design documents,” to store JavaScript code in the database. The code defines the views I mentioned earlier. Another thing you can store is validation functions. This is code that CouchDB executes when you save a document to the database. It accepts a document as input, and can reject it, so you do have control over the schema of documents — it doesn’t have to be a free-for-all. In the blog application, you can have a validation function that starts by enforcing “every document must have a ‘type’ property, and its content must be one of (post,user,comment).” Then you can have separate validation logic for each type of document.
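
As a rough sketch, the “type” rule for the blog application could look like the function below. In CouchDB the function is stored as a string under the validate_doc_update field of a design document and receives the new document, the old document, and the user context; rejecting a write is done by throwing an object with a forbidden message. It is named here only so it can be run standalone, and the post_id field is a hypothetical example.

```javascript
// Sketch of a validation function; the name validateDocUpdate is for this example only.
function validateDocUpdate(newDoc, oldDoc, userCtx) {
  var allowedTypes = ['post', 'user', 'comment'];

  // Deletions carry no content to validate
  if (newDoc._deleted) {
    return;
  }
  if (allowedTypes.indexOf(newDoc.type) === -1) {
    throw { forbidden: "Every document must have a 'type' of post, user, or comment" };
  }
  // Per-type rules go here; post_id is an assumed field name
  if (newDoc.type === 'comment' && !newDoc.post_id) {
    throw { forbidden: 'A comment must reference a post_id' };
  }
}

// Collects the rejection message, or null when the document is accepted
function tryValidate(doc) {
  try {
    validateDocUpdate(doc, null, {});
    return null;
  } catch (rejection) {
    return rejection.forbidden;
  }
}

var acceptedPost = tryValidate({ _id: 'helloworld', type: 'post' });
var rejectedUntyped = tryValidate({ _id: 'mystery' });
var rejectedComment = tryValidate({ _id: 'c1', type: 'comment' });
```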

Design documents can also contain something called “show functions.” Each show function has its own URL; CouchDB executes the function’s code in response to HTTP requests to that URL, and sends the resulting data back as an HTTP response (as usual). With show functions, you can store entire applications inside the database. Your browser might never even know that it’s talking to a database directly, instead of a web server with a database behind it.

CouchDB isn’t designed for arbitrary queries at runtime. You can only query one view, show function, or database at a time. You can’t do joins. You can’t do arbitrary GROUP BY and ORDER BY. You have to decide in advance what operations you’re going to need, and build views for them. You can then issue requests to those views, essentially the equivalent of key lookups and range scans with a few basic options such as an offset, limit, and reverse order. Now, having said that, you can define views that reduce the database down to aggregates, create a custom ordering, and so on. You can define the equivalent of the relational “project” operation inside your view code.
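
To make the “key lookups and range scans” concrete, here is a small sketch that builds view request URLs. The path shape /db/_design/ddoc/_view/name and the limit, skip, descending, key, startkey, and endkey parameters are CouchDB's; the helper viewUrl and the design document and view names (posts, by_date) are made up for illustration.

```javascript
// Hypothetical helper that builds the URL for querying a view.
function viewUrl(db, designDoc, viewName, options) {
  var params = Object.keys(options || {}).map(function (name) {
    var value = options[name];
    // Keys are JSON values, so they are JSON-encoded before URL-encoding
    var isKeyParam = name === 'key' || name === 'startkey' || name === 'endkey';
    var text = isKeyParam ? JSON.stringify(value) : String(value);
    return encodeURIComponent(name) + '=' + encodeURIComponent(text);
  });
  return '/' + encodeURIComponent(db) +
    '/_design/' + encodeURIComponent(designDoc) +
    '/_view/' + encodeURIComponent(viewName) +
    (params.length ? '?' + params.join('&') : '');
}

// Newest ten rows: reverse order plus a limit
var latestUrl = viewUrl('blog', 'posts', 'by_date', { limit: 10, descending: true });

// Range scan over string keys from "2010-09" up to "2010-10"
var rangeUrl = viewUrl('blog', 'posts', 'by_date', { startkey: '2010-09', endkey: '2010-10' });
```

Everything a query can express has to fit in those few parameters, which is why the views themselves must be designed in advance.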

Here’s how: a view is a map-reduce operation, defined in two parts, the map and the reduce. The map is required; it is a JavaScript function that generates the contents of the view. CouchDB iterates over the database and feeds each document into the function, collects the results, and inserts them into the view’s B+Tree index. Inside the map function’s code, you emit key-value 2-tuples.

* The key will identify the tuple in the index that’s built to store this view. It can be simple or complex, so you can create a view that’s keyed by [this,that,the_other_thing]. The view will be ordered by the same thing; that’s how B+Trees work.
* The value you emit is whatever you want the B+Tree to store at its leaf nodes, and can also be complex (it’s a document, like any other).

The “reduce” part of the operation is optional. It computes what is stored in the non-leaf nodes of the B+Tree index. For example, you can use it to create aggregates, such as summing up counts of comments. In addition to the reduce part of the code, there is a “rereduce”. The rereduce is called as the operation is invoked on higher and higher non-leaf nodes, all the way to the root of the tree. CouchDB knows how to take advantage of the data that’s stored by these reduce and rereduce operations, so for example, it doesn’t necessarily have to descend all the way to the leaf nodes and scan in order to count how many documents match a particular query.
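
The map, reduce, and rereduce steps can be imitated in a few lines of JavaScript. This is a simulation only: in CouchDB, emit is a global provided to the map function and the index is a B+Tree, whereas here emit is passed in as a parameter and the index is a sorted array. The helper buildView and the post_id field are assumptions for the example.

```javascript
// Map: emit one row per comment, keyed by the post it belongs to
function mapCommentsByPost(doc, emit) {
  if (doc.type === 'comment') {
    emit(doc.post_id, 1);
  }
}

// Reduce: sums values. On rereduce the values are earlier reduce results,
// and summing those is still the right thing to do.
function reduceCount(keys, values, rereduce) {
  return values.reduce(function (sum, n) { return sum + n; }, 0);
}

// Hypothetical stand-in for CouchDB's indexer: run the map over every
// document and keep the emitted rows ordered by key.
function buildView(docs, mapFn) {
  var rows = [];
  docs.forEach(function (doc) {
    mapFn(doc, function (key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  rows.sort(function (a, b) {
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
  });
  return rows;
}

function valuesForKey(rows, key) {
  return rows
    .filter(function (row) { return row.key === key; })
    .map(function (row) { return row.value; });
}

var docs = [
  { _id: 'p1', type: 'post', title: 'Hello' },
  { _id: 'c1', type: 'comment', post_id: 'p1' },
  { _id: 'c2', type: 'comment', post_id: 'p2' },
  { _id: 'c3', type: 'comment', post_id: 'p1' }
];

var rows = buildView(docs, mapCommentsByPost);
var p1Count = reduceCount(null, valuesForKey(rows, 'p1'), false);
var p2Count = reduceCount(null, valuesForKey(rows, 'p2'), false);
// Rereduce combines the partial results, as happens higher up the tree
var totalComments = reduceCount(null, [p1Count, p2Count], true);
```

Because the partial counts are reusable, a total can be answered from the stored intermediate results without rescanning every row.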

An important thing to know about all this code is that nothing is allowed to have side effects. You can’t modify the database in a view definition, for example. Documents are immutable; it’s all copy-on-write. You get input; you can specify output; that’s it, period. It’s a form of functional programming. Why do we care? Because it keeps things simple and elegant, and enables all kinds of nice properties and functionality, such as replication and eventual consistency and cache expiry and scaling to multiple nodes and so on.

The database file is append-only. Old versions don’t automatically get cleaned up. The database grows forever until you compact it. This process builds a new database and then does a swap-and-discard. The append-only, copy-on-write design makes backups easy, and data corruption unlikely.

CouchDB comes with a “graphical user interface” called Futon. It’s built right into the database, and surprise! — it works through HTTP and Ajax. You just fire up CouchDB, point your Web browser to /_utils, and go. It’s a fun way to explore CouchDB.

With all that in mind, why would you want to use CouchDB instead of a relational database? For most things I’m involved with, I want a relational database. But I got asked recently to help with a database that’ll store records about people. Although nobody has implemented anything yet, it’s a terrible match for a relational database, and an excellent fit for a document-oriented one. The inputs are going to be arbitrary documents with different structures, such as census records, birth records, tax records, estate and probate records, marriage records, and so on. Nobody knows what it’s going to store in the future. When people build “flexible schemas” in relational databases, they usually go for the so-called EAV or EBLOB models. In other words, they aren’t using the database relationally at all, and it simply doesn’t work well. This type of project needs a document-oriented database.

I’ve left out a lot of important details, but the point of this post is to understand the high-level CouchDB concepts and how they’re implemented, so you can reason for yourself about it. If you’ve read this far and you think that CouchDB might be a good fit for your needs, I encourage you to take a look at CouchDB, The Definitive Guide.

9/24/2010

Performance of Greedy vs. Lazy Regex Quantifiers

Regular Expressions Cookbook

Problem
Match a pair of <p> and </p> XHTML tags and the text between them. The text between the tags can include other XHTML tags.
Solution
<p>.*?</p>
Take a look at one incorrect solution for the problem in this recipe:
<p>.*</p>
After matching the first <p> tag in the subject, the engine reaches ‹.*›. The dot matches any character, including line breaks (provided the “dot matches line breaks” option is on). The asterisk repeats it zero or more times. The asterisk is greedy, and so ‹.*› matches everything all the way to the end of the subject text. Let me say that again: ‹.*› eats up your whole XHTML file, starting with the first paragraph.

When the ‹.*› has its belly full, the engine attempts to match the ‹<› at the end of the subject text. That fails. But it’s not the end of the story: the regex engine backtracks. The asterisk prefers to grab as much text as possible, but it’s also perfectly satisfied to match nothing at all (zero repetitions). With each repetition of a quantifier beyond the quantifier’s minimum, the regular expression stores a backtracking position. Those are positions the engine can go back to, in case the part of the regex following the quantifier fails.

When ‹<› fails, the engine backtracks by making the ‹.*› give up one character of its match. Then ‹<› is attempted again, at the last character in the file. If it fails again, the engine backtracks once more, attempting ‹<› at the second-to-last character in the file. This process continues until ‹<› succeeds. If ‹<› never succeeds, the ‹.*› eventually runs out of backtracking positions and the overall match attempt fails.

If ‹<› does match at some point during all that backtracking, ‹/› is attempted. If ‹/› fails, the engine backtracks again. This repeats until ‹</p>› can be matched entirely.

So what’s the problem? Because the asterisk is greedy, the incorrect regular expression matches everything from the first <p> in the XHTML file to the last </p>. But to correctly match an XHTML paragraph, we need to match the first <p> with the first </p> that follows it.

That’s where lazy quantifiers come in. You can make any quantifier lazy by placing a question mark after it: ‹*?›, ‹+?›, ‹??›, and ‹{7,42}?› are all lazy quantifiers.

Lazy quantifiers backtrack too, but the other way around. A lazy quantifier repeats as few times as it has to, stores one backtracking position, and allows the regex to continue. If the remainder of the regex fails and the engine backtracks, the lazy quantifier repeats once more. If the regex keeps backtracking, the quantifier will expand until its maximum number of repetitions, or until the regex token it repeats fails to match.
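
A quick way to see the difference between the two quantifiers is to run both patterns over a subject that contains two paragraphs. The subject string below is made up for the example and is kept on one line, because in JavaScript the dot does not match line breaks unless the s flag is set.

```javascript
// Sample subject (an assumption for this sketch): two paragraphs with another tag between them
var subject = '<p>first</p><div>other</div><p>second</p>';

// Greedy: runs to the end of the subject, then backtracks to the LAST </p>
var greedyMatch = subject.match(/<p>.*<\/p>/)[0];

// Lazy: expands one character at a time and stops at the FIRST </p>
var lazyMatch = subject.match(/<p>.*?<\/p>/)[0];
```

Here greedyMatch is the entire subject, while lazyMatch is only '<p>first</p>'.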


Performance of Greedy vs. Lazy Regex Quantifiers
Consider the following simple example: When the regexes <.*?> and <[^>]*> are applied to the subject string "<0123456789>", they are functionally equivalent. The only difference is how the regex engine goes about generating the match. However, the latter regex (which uses a greedy quantifier) is more efficient, because it describes what the user really means: match the character "<", followed by any number of characters which are not ">", and finally, match the character ">". Defined this way, it requires no backtracking in the case of any successful match, and only one backtracking step in the case of any unsuccessful match. Hand-optimization of regex patterns largely revolves around the ideas of reducing backtracking and the steps which are potentially required to match or fail at any given character position, and here we've reduced both cases to the absolute minimum.
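
The equivalence claimed for this subject string is easy to check, although the check says nothing about speed; measuring backtracking steps needs a regex debugger or a benchmark. The variable names are only for this sketch.

```javascript
var subject = '<0123456789>';

// Lazy dot: tries '>' after every single character it consumes
var lazyDotMatch = subject.match(/<.*?>/)[0];

// Greedy negated class: consumes everything that is not '>' in one pass
var negatedClassMatch = subject.match(/<[^>]*>/)[0];

// With no closing '>' both patterns fail, they just get there differently
var unterminated = '<0123456789';
var lazyDotFails = !/<.*?>/.test(unterminated);
var negatedClassFails = !/<[^>]*>/.test(unterminated);
```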


jQuery Regular Expressions Review (Rev:20100921_1700)
/color|date|datetime|email|hidden|month|number|password|range|search|tel|text|time|url|week/i
/\b(?:color|date|datetime|email|hidden|month|number|password|range|search|tel|text|time|url|week)/i
In a manner similar to the previous example, adding (or "exposing") a word boundary anchor to the beginning of this regex improves the efficiency in the case of non-matches, by reducing the number of positions within the string where the regex engine attempts a match. Instead of attempting a match at every location within a target string, the engine now only needs to check on word boundaries.
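
One thing worth checking before making this change is that the anchor also alters what the regex accepts, not just how fast it fails. The sketch below compares the two patterns on a few made-up inputs; the variable names are for this example only.

```javascript
var withoutBoundary = /color|date|datetime|email|hidden|month|number|password|range|search|tel|text|time|url|week/i;
var withBoundary = /\b(?:color|date|datetime|email|hidden|month|number|password|range|search|tel|text|time|url|week)/i;

// Both accept a plain input type, regardless of case
var bothMatchEmail = withoutBoundary.test('Email') && withBoundary.test('Email');

// Both reject a string containing none of the words
var bothRejectButton = !withoutBoundary.test('button') && !withBoundary.test('button');

// They disagree when a listed word appears in the middle of another word:
// 'update' contains 'date', but not at a word boundary
var unanchoredMatchesUpdate = withoutBoundary.test('update');
var anchoredMatchesUpdate = withBoundary.test('update');
```

unanchoredMatchesUpdate is true and anchoredMatchesUpdate is false, so the optimisation is only safe when the inputs are known to be bare words such as input type values.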

9/19/2010

pdf crop

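New 6-inch Kindle announced (Kindle 3) - Mobile01 discussion forum

About the extra white margins when reading PDFs on a Kindle DX: I use a program called A-PDF Page Crop 3.4. It is paid software, but the downloadable trial version only adds a watermark to the first page. The other pages are unaffected, and I have not noticed any time limit either.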


SourceForge.net: briss
This project was registered on SourceForge.net on May 4, 2010, and is described by the project team as follows:

This project aims to offer a simple cross-platform application for cropping PDF files. A simple user interface lets you define exactly the crop-region by fitting a rectangle on the visually overlaid pages.

9/13/2010

Microdata: HTML5’s Best-Kept Secret



<div itemscope itemtype="http://data-vocabulary.org/Organization">
    <h1 itemprop="name">Hendershot's Coffee Bar</h1>
    <p itemprop="address" itemscope itemtype="http://data-vocabulary.org/Address">
        <span itemprop="street-address">1560 Oglethorpe Ave</span>,
        <span itemprop="locality">Athens</span>,
        <span itemprop="region">GA</span>.
    </p>
</div>


The Microdata markup adds a few attributes you may not have seen before: itemscope, itemtype and itemprop. The first is essentially just a top-level marker; it tells the search engine spider that you’re about to define something in the following nested tags. The itemtype attribute tells the spider what you’re defining (in this case, an organization).

The rest of the markup should look pretty familiar if you’ve used Microformats. The main change is the itemprop attribute (short for item property) to define what each element is. Because our address is all one paragraph, we’ve added some span tags to define each element of the address separately — street address, locality and so on. If we wanted, we could add other properties like a phone number (itemprop="tel"), a URL (itemprop="url") or even geodata (itemprop="geo").

So where did we get these itemprop vocabularies from? Well, as the URL in the itemtype attribute indicates, they come from data-vocabulary.org. Of course you can make up your own itemprop syntax, but if you want search engine spiders to understand your microdata, you’re going to have to document what you’re doing. Since the definitions at data-vocabulary.org cover a number of common use cases — events, organizations, people, products, recipes, reviews — it makes a good starting point.

Microformats and RDFa
Actually, the reasoning seems to have been something like this: Microformats are a really good idea, but essentially a hack. Because Microformats rely only on the class and rel attributes, writing parsers to read them is complicated.

At the same time, RDFa was designed to work with the now-defunct XHTML 2.0 spec. Although RDFa is being ported to work with HTML5, it can be overly complex for many use cases. RDFa is a bit like asking what time it is and having someone tell you how to build a watch. Yes, RDFa can do the same things HTML5 microdata and Microformats do (and more), but if the history of the web teaches us a lesson, it’s that simpler solutions almost always win.


Google - Rich snippets and structured markup - Webmaster Tools Help

* Using Microformats in HTML5
* Where on the Web Is HTML5?
* Add Semantic Value to Your Pages With HTML 5
* Microformats are Awesome, Now Put Them to Work for Your Site

9/10/2010

yui synchronousCall


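// Runs the given functions one after another, waiting for any Ajax request
// a function starts before moving on to the next one.
// Connect is assumed to be YUI 2's YAHOO.util.Connect.
function synchronousCall() {
    var args = [].slice.call(arguments);
    var iter = function (args) {
        var fn = args.shift();
        // A truthy return value means no Ajax call was made, so execute the next function now
        if (fn && fn.call()) {
            iter(args);
        }
    };

    iter(args);

    // When an Ajax request completes, continue with the next function in the queue
    Connect.completeEvent.subscribe(function (type, data, args) {
        console.log('completeEvent', args);
        iter(args);
    }, args);
}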

Internet Explorer Developer Toolbar



creating, understanding, and troubleshooting Web pages.
6/7/2010

9/06/2010

align checkboxes, radios, text inputs with their label

/* align checkboxes, radios, text inputs with their label
by: Thierry Koblentz tjkdesign.com/ez-css/css/base.css */
input[type="radio"] { vertical-align: text-bottom; }
input[type="checkbox"] { vertical-align: bottom; *vertical-align: baseline; }
.ie6 input { vertical-align: text-bottom; }


via: http://html5boilerplate.com/

matchit.zip - extended % matching for HTML, LaTeX, and many other languages : vim online


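Jumping between brackets with the % command is really satisfying when playing around with C. After installing this thing, you can also jump between HTML tags with %, and between #ifdef and #endif too.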

9/05/2010

Media queries


Media queries consist of a media type and one or more expressions, involving media features, which resolve to either true or false. The result of the query is true if the media type specified in the media query matches the type of device the document is being displayed on and all expressions in the media query are true.


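media_query_list: <media_query> [, <media_query> ]*
media_query: [[only | not]? <media_type> [ and <expression> ]*]
| <expression> [ and <expression> ]*
expression: ( <media_feature> [: <value>]? )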
media_type: all | aural | braille | handheld | print |
projection | screen | tty | tv | embossed
media_feature: width | min-width | max-width
| height | min-height | max-height
| device-width | min-device-width | max-device-width
| device-height | min-device-height | max-device-height
| aspect-ratio | min-aspect-ratio | max-aspect-ratio
| device-aspect-ratio | min-device-aspect-ratio | max-device-aspect-ratio
| color | min-color | max-color
| color-index | min-color-index | max-color-index
| monochrome | min-monochrome | max-monochrome
| resolution | min-resolution | max-resolution
| scan | grid
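
The rule that a query is true only when the media type matches and every expression is true can be modelled in a few lines. This is a toy model, not how a browser parses or evaluates CSS: the query is written as an object rather than parsed from text, and matchesMediaQuery, maxDeviceWidth, minDeviceWidth, and the device object are all hypothetical names.

```javascript
// Each expression is a function that tests one media feature of the device
function maxDeviceWidth(px) {
  return function (device) { return device.deviceWidth <= px; };
}
function minDeviceWidth(px) {
  return function (device) { return device.deviceWidth >= px; };
}

// True when the media type matches AND all expressions are true;
// a leading "not" negates the result of the whole query.
function matchesMediaQuery(query, device) {
  var typeMatches = query.type === 'all' || query.type === device.type;
  var allExpressionsTrue = query.expressions.every(function (expression) {
    return expression(device);
  });
  var result = typeMatches && allExpressionsTrue;
  return query.not ? !result : result;
}

// Model of: screen and (max-device-width: 480px)
var smallScreenQuery = { type: 'screen', expressions: [maxDeviceWidth(480)] };

var phone = { type: 'screen', deviceWidth: 320 };
var desktop = { type: 'screen', deviceWidth: 1280 };
var printer = { type: 'print', deviceWidth: 320 };

var phoneMatches = matchesMediaQuery(smallScreenQuery, phone);     // type and expression both hold
var desktopMatches = matchesMediaQuery(smallScreenQuery, desktop); // expression fails
var printerMatches = matchesMediaQuery(smallScreenQuery, printer); // media type fails
```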


CSS Media Queries: Bees Knees Or Spawn of Satan?

<link rel="stylesheet" type="text/css"
media="screen and (max-device-width: 480px)"
href="shetland.css" />

Responsive Web Design

http://caniuse.com/


Professional JavaScript


functional libraries
* Underscore.js http://documentcloud.github.com/underscore/
* Functional JavaScript http://osteele.com/sources/javascript/functional/

remove spacing for firefox inline-block elements

http://yuiblog.com/sandbox/yui/3.2.0pr1/build/cssgrids/grids.css


.yui3-g {
letter-spacing: -0.31em; /* webkit: collapse white-space between units */
*letter-spacing: normal; /* reset IE < 8 */
word-spacing: -0.43em; /* IE < 8 && gecko: collapse white-space between units */
}

cleaner:

<ul><li>
stuff...
</li><li>
more stuff
</li><li>
ok, enough stuff, already.
</li></ul>

sample: http://chunghe.googlecode.com/svn/trunk/experiment/inline-block/inline-block-firefox.htm

CSS inline/block nuances
White Space Collapsing: the 'white-space-collapse' property - it hasn't been implemented by any vendor as of yet.

9/02/2010

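[Food] Taipei East District: 傑夫的燒肉 (Jeff's Yakiniku) @ Masako的抹茶福氣家 :: 痞客邦 PIXNET ::

Address: No. 7, Alley 6, Lane 170, Section 4, Zhongxiao East Road
(Exit 5 of the Zhongxiao Dunhua MRT station, in the alley directly behind the San Want Hotel)

Phone: 02-27735985
Price: 399 / 499 (some dishes differ). No 10% service charge during the opening period; not sure about afterwards.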